Homework 1

#Exercise 1.2 (Get to know your workspace.)

Answer the following questions:

What is the R workspace file on your operating system?

I am working in RStudio Cloud, in the Public Health 251 D workspace, which is our computing environment.

What is the file path to your R workspace file?

Using getwd(), which returns the current working directory, we get “/cloud/project/Homeworks Epi251”.

What is the name of this workspace file?

Homework 1.Rmd.

When you launched R, which R packages loaded?

Using the search() function we see the attached packages: stats, graphics, grDevices, utils, datasets, methods, and base.

What are the file paths to the loaded R packages?

We can use the function searchpaths() to see the paths of the packages. They are all installed under /opt/R/3.6.0/lib/R/library.

List all the objects in the current workspace. If there are none, create some data objects. Using one expression, remove all the objects in the current workspace.

I used the function objects() (equivalently, ls()) to list the existing objects. It returned “character(0)”, so I created the vector var with three values, var<-c(1,1,1), and then removed it with rm(var). Because var was the only object, this left the workspace empty.

getwd()

## [1] "/cloud/project/Homeworks Epi251"

search()

## [1] ".GlobalEnv"        "package:stats"     "package:graphics" 
## [4] "package:grDevices" "package:utils"     "package:datasets" 
## [7] "package:methods"   "Autoloads"         "package:base"

searchpaths()

## [1] ".GlobalEnv"                          
## [2] "/opt/R/3.6.0/lib/R/library/stats"    
## [3] "/opt/R/3.6.0/lib/R/library/graphics" 
## [4] "/opt/R/3.6.0/lib/R/library/grDevices"
## [5] "/opt/R/3.6.0/lib/R/library/utils"    
## [6] "/opt/R/3.6.0/lib/R/library/datasets" 
## [7] "/opt/R/3.6.0/lib/R/library/methods"  
## [8] "Autoloads"                           
## [9] "/opt/R/3.6.0/lib/R/library/base"

objects()

## character(0)

var<-c(1,1,1)
rm(var)
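
The exercise asks for one expression that removes every object, whatever the objects are called. A minimal sketch of that idiom follows; the environment `scratch` and the object names `a` and `b` are illustrative assumptions, used so that running the example does not wipe a real workspace.

```r
# Hypothetical scratch environment standing in for the workspace
scratch <- new.env()
assign("a", c(1, 1, 1), envir = scratch)
assign("b", "some text", envir = scratch)

ls(envir = scratch)   # lists the two objects created above

# One expression removes everything that ls() finds
rm(list = ls(envir = scratch), envir = scratch)

ls(envir = scratch)   # now empty
```

In the global workspace the same idea is written rm(list = ls()).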

#Exercise 1.3 (Get to know math operators.)

Using Table 1.1, explain in words, and use R to illustrate, the difference between modulus and integer divide.

The difference is that integer divide (%/%) returns the whole-number quotient of the division, while modulus (%%) returns the remainder. For example:

(integer_divide<-5%/%2)

## [1] 2

#This returns how many whole times 2 fits into 5, which is 2

(modulus<-5%%2)

## [1] 1

#This returns the remainder of the division, which is 1
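
The two operators are linked: multiplying the integer quotient by the divisor and adding the remainder gives back the original number. The short sketch below checks this for the example above and also shows what R does with a negative dividend; the variable names are illustrative.

```r
x <- 5
y <- 2
quotient  <- x %/% y   # integer divide: whole number of times y fits into x
remainder <- x %% y    # modulus: what is left over

# The two results reconstruct the original number
quotient * y + remainder == x

# With a negative dividend R rounds the quotient down (toward minus infinity),
# so the remainder takes the sign of the divisor
neg_quotient  <- (-5) %/% 2
neg_remainder <- (-5) %% 2
neg_quotient * 2 + neg_remainder == -5
```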

#Exercise 1.4 (Calculating body mass index.)

BMI is an indicator of total body fat, which is related to the risk of disease and death. The score is valid for both men and women but it does have some limitations: it may overestimate body fat in athletes and others who have a muscular build, it may underestimate body fat in older persons and others who have lost muscle mass.

Calculate the BMI for a male with weight of 155 lb and height of 5 ft 7 in.

#To do this we need to convert pounds to kilograms (1 kg is about 2.2 lb) and feet and inches to meters (1 in = 0.0254 m).

(kg<-(155/2.2))

## [1] 70.45455

(mts<-0.0254*(67)) #5 ft 7 in is 5*12+7 = 67 inches, converted here to meters

## [1] 1.7018

(bmi<-kg/(mts*mts))

## [1] 24.32719
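
The same calculation can be wrapped in a small function so it is reusable for other weights and heights. This is only a sketch: `bmi_imperial()` is a hypothetical helper, and it uses the exact conversion factor 0.45359237 kg per lb instead of the approximation 2.2 lb per kg, so its result differs slightly from the value above.

```r
# bmi_imperial() is a hypothetical helper, not part of any package
bmi_imperial <- function(weight_lb, height_ft, height_in) {
  kg <- weight_lb * 0.45359237                  # pounds to kilograms (exact factor)
  m  <- (height_ft * 12 + height_in) * 0.0254   # total inches to meters
  kg / m^2                                      # BMI = weight (kg) / height (m)^2
}

bmi_value <- bmi_imperial(155, 5, 7)
round(bmi_value, 1)
```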

#Exercise 1.5 (Logarithms)

In R, the log function is to the base e. Implement the following R code and study the graph:

curve(log(x), 0, 6)
abline(v = c(1, exp(1)), h = c(0, 1), lty = 2)

What kind of generalizations can you make about the natural logarithm and its base—the number e?

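The natural logarithm, considered as a real-valued function of a positive real variable, is the inverse of the exponential function. The dashed lines in the graph mark two reference points: log(1) = 0 and log(e) = 1. The function is negative for x between 0 and 1, positive for x greater than 1, and it keeps increasing, but more and more slowly.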
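
These properties can be checked numerically. The sketch below uses only base R; the test vector `x` is an arbitrary choice.

```r
log(1)        # 0: the curve crosses the horizontal axis at x = 1
log(exp(1))   # 1: the curve reaches height 1 at x = e

x <- c(0.5, 1, 2, 10)

# exp() undoes log(), which is what "inverse function" means
all.equal(exp(log(x)), x)

# The logarithm turns products into sums
all.equal(log(2 * 3), log(2) + log(3))

# Values below 1 have negative logarithms
log(0.5) < 0
```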

#Exercise 1.6 (Risk and risk odds)

Risk (R) is a probability bounded between 0 and 1. Odds is the following transformation of R: R/(1-R).

library(ggplot2)
library(ggpubr)

## Loading required package: magrittr

par(mfrow=c(1,2))
  curve(x/(1-x), 0, 1)
  curve(log(x/(1-x)), 0, 1)

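What kind of generalizations can you make about the log (odds) as a transformation of risk?

Log odds are an alternate way of expressing probabilities (risks). Risk is bounded between 0 and 1 and odds between 0 and infinity, but the log odds can take any real value. They equal 0 when the risk is 0.5 and are symmetric around that point: a risk of 1-R has the same log odds as R with the opposite sign.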
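
A few numbers make the transformation concrete. The sketch below computes odds and log odds for an arbitrary set of risks and compares them with qlogis() and plogis(), the logistic quantile and distribution functions in the stats package; the risk values are illustrative.

```r
risk <- c(0.1, 0.25, 0.5, 0.75, 0.9)
odds <- risk / (1 - risk)
log_odds <- log(odds)

# qlogis() computes the same log odds; plogis() converts them back to risks
all.equal(log_odds, qlogis(risk))
all.equal(plogis(log_odds), risk)

# Symmetry around risk = 0.5: reversing the risks flips the sign of the log odds
all.equal(log_odds, -rev(log_odds))
```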

#Exercise 1.7 (HIV transmission probabilities)

Review Table 1.6 and answer the questions. For each non-blood transfusion transmission probability (per act risk) in Table 1.6, calculate the cumulative risk of being infected after one year (365 days) if one carries out the same act once daily for one year with an HIV-infected partner. Do these cumulative risks make intuitive sense? Why or why not?

exposures<-c("Needle-sharing injection-drug use (IDU)","Receptive anal intercourse (RAI)","Percutaneous needle stick (PNS)", "Receptive penile-vaginal intercourse (RPVI)", "Insertive anal intercourse (IAI)", "Insertive penile-vaginal intercourse (IPVI)", "Receptive oral intercourse on penis (ROI)", "Insertive oral intercourse with penis (IOI)")
risks<-c(67, 50, 30, 10, 6.5, 5, 1, 0.5) #per-act risks, expressed per 10,000 exposures
data<-data.frame(exposures, risks)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

mutate(data, p_exposure=risks/10000, one_p=1-(p_exposure), p_365=((one_p)^365), p_infected=1-(p_365))

##                                     exposures risks p_exposure   one_p
## 1     Needle-sharing injection-drug use (IDU)  67.0    0.00670 0.99330
## 2            Receptive anal intercourse (RAI)  50.0    0.00500 0.99500
## 3             Percutaneous needle stick (PNS)  30.0    0.00300 0.99700
## 4 Receptive penile-vaginal intercourse (RPVI)  10.0    0.00100 0.99900
## 5            Insertive anal intercourse (IAI)   6.5    0.00065 0.99935
## 6 Insertive penile-vaginal intercourse (IPVI)   5.0    0.00050 0.99950
## 7   Receptive oral intercourse on penis (ROI)   1.0    0.00010 0.99990
## 8 Insertive oral intercourse with penis (IOI)   0.5    0.00005 0.99995
##        p_365 p_infected
## 1 0.08597238 0.91402762
## 2 0.16048131 0.83951869
## 3 0.33398948 0.66601052
## 4 0.69406989 0.30593011
## 5 0.78873322 0.21126678
## 6 0.83314662 0.16685338
## 7 0.96415633 0.03584367
## 8 0.98191507 0.01808493

#The cumulative risks make sense: repeating the exposure every day raises the cumulative probability of infection, which approaches 1 for the riskiest acts.
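
The table is built on one formula: if each act carries risk p and the acts are independent, the chance of escaping infection n times is (1-p)^n, so the cumulative risk is 1-(1-p)^n. The base R sketch below applies it to two rows of the table; `cumulative_risk()` is a hypothetical helper, and the independence of acts is an assumption of the calculation.

```r
# cumulative_risk() is a hypothetical helper: risk of at least one infection
# in n independent acts, each with per-act risk p
cumulative_risk <- function(p, n) {
  1 - (1 - p)^n
}

per_act <- c(IDU = 67, ROI = 1) / 10000   # two rows of the table, as probabilities
cum <- cumulative_risk(per_act, 365)
round(cum, 4)

# Simply multiplying p by 365 is not a valid probability for large p:
# for IDU it exceeds 1, while for ROI it is close to the exact value
naive <- per_act * 365
naive
```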

#Exercise 1.8 (Sourcing files and sinking log files)

The source function in R is used to “source” (read in) ASCII text files. Take a group of R commands that worked from a previous problem above and paste them into an ASCII text file and save it with the name job01.R. Then from R command line, source the file.

source('/cloud/project/Homeworks Epi251/job01.R')
source('/cloud/project/Homeworks Epi251/job01.R', echo=TRUE)

## 
## > (sum = 1 + 2)
## [1] 3

#source() runs a script stored in a different file. With echo=TRUE, each command in job01.R is printed in the console together with its output, even though the file itself is not open.

sink("/cloud/project/Homeworks Epi251/job01.log1a")
source("/cloud/project/Homeworks Epi251/job01.R")
sink() 

sink("/cloud/project/Homeworks Epi251/job01.log1b")
source("/cloud/project/Homeworks Epi251/job01.R", echo=TRUE)

## 
## > (sum = 1 + 2)
## [1] 3

sink() 

#For job01.log1a, source() just runs the file; for job01.log1b, echo=TRUE also echoes the commands in job01.R and their results.

source('/cloud/project/job02.R')

## [1] 0.01808493 0.03584367 0.16685338 0.21126678 0.30593011 0.66601052
## [7] 0.83951869 0.91402762

#It sourced the program, ran it and showed the results, just as we did with the echo argument.
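
The sink-and-source pattern can be tried without touching the homework files. The sketch below is self-contained: it writes a one-line script to a temporary file, sources it with echo=TRUE while output is diverted, and then reads the log back. The file names come from tempfile() and the script content is an illustrative assumption, not the actual job01.R.

```r
# Temporary stand-ins for job01.R and its log file
script_file <- tempfile(fileext = ".R")
log_file    <- tempfile(fileext = ".log")
writeLines("(total <- 1 + 2)", script_file)

sink(log_file)                    # divert console output to the log file
source(script_file, echo = TRUE)  # echo the command and its result
sink()                            # restore output to the console

log_lines <- readLines(log_file)
log_lines                         # the echoed command and its printed result
```

Calling sink() with no arguments after source() matters: until it runs, later output keeps going to the log file instead of the console.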

Alma Juarez

7/9/2019