1. Run a program (filename) using the 'source' command;
source("filename1.R", echo = TRUE)
##
## > Status <- rep(c("Dead", "Survived"), 4)
##
## > Treatment <- rep(c("Tolbutamide", "Placebo"), c(2,
## + 2), 2)
##
## > Agegrp <- rep(c("<55", "55+"), c(4, 4))
##
## > Freq <- c(8, 98, 5, 115, 22, 76, 16, 69)
##
## > dat <- data.frame(Status, Treatment, Agegrp, Freq)
##
## > dat
## Status Treatment Agegrp Freq
## 1 Dead Tolbutamide <55 8
## 2 Survived Placebo <55 98
## 3 Dead Tolbutamide <55 5
## 4 Survived Placebo <55 115
## 5 Dead Tolbutamide 55+ 22
## 6 Survived Placebo 55+ 76
## 7 Dead Tolbutamide 55+ 16
## 8 Survived Placebo 55+ 69
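A minimal, self-contained sketch of what `source` does, assuming a throwaway script written to a temporary file (the script contents and the names `script`, `env`, and `total` are hypothetical and not part of filename1.R). Passing an environment to `local` keeps the sourced objects out of the global workspace.

```r
# Write a tiny hypothetical script to a temporary file
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:5", "total <- sum(x)"), script)

# Evaluate the script inside its own environment instead of the workspace
env <- new.env()
source(script, local = env)

# Objects created by the script are now available in 'env'
env$total
```

Adding `echo = TRUE`, as in the call above, additionally prints each expression before it is evaluated.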
2. Demonstrate reading an ASCII data file (filename2.dat) to create a 'data frame';
oswego <- read.csv("http://medepi.net/data/oswego/oswego2.csv", header = TRUE)
anthrax <- read.csv("http://www.medepi.net/data/anthrax_labs.txt", header = TRUE)
head(oswego)
## ID Age Sex MealDate MealTime Ill OnsetDate TimeOnset Baked.Ham Spinach
## 1 2 52 F 04/18/1940 20:00:00 Y 4/19/1940 00:30:00 Y Y
## 2 3 65 M 04/18/1940 18:30:00 Y 4/19/1940 00:30:00 Y Y
## 3 4 59 F 04/18/1940 18:30:00 Y 4/19/1940 00:30:00 Y Y
## 4 6 63 F 04/18/1940 19:30:00 Y 4/18/1940 22:30:00 Y Y
## 5 7 70 M 04/18/1940 19:30:00 Y 4/18/1940 22:30:00 Y Y
## 6 8 40 F 04/18/1940 19:30:00 Y 4/19/1940 02:00:00 N N
## Mashed.Potatoes Cabbage.Salad Jello Rolls Brown.Bread Milk Coffee Water
## 1 Y N N Y N N Y N
## 2 Y Y N N N N Y N
## 3 N N N N N N Y N
## 4 N Y Y N N N N Y
## 5 Y N Y Y Y N Y Y
## 6 N N N N N N N N
## Cakes Vanilla.Ice.Cream Chocolate.Ice.Cream Fruit.Salad
## 1 N Y N N
## 2 N Y Y N
## 3 Y Y Y N
## 4 N Y N N
## 5 N Y N N
## 6 N Y Y N
head(anthrax)
## labid caseid lab.date site sample test result
## 1 101 1 10/19/2001 Blood Serum IgG Positive
## 2 102 2 10/12/2001 Skin Biopsy IHC Positive
## 3 103 2 10/12/2001 Blood Serum IgG Positive
## 4 104 3 10/18/2001 Blood Serum IgG Positive
## 5 105 4 10/15/2001 Pleural Biopsy IHC Positive
## 6 106 4 10/15/2001 Blood Serum IgG Positive
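A small sketch of the same `read.csv` call that runs without a network connection, assuming a few columns of the first rows shown above are supplied as inline text (the object names `csv_text` and `demo` are hypothetical). It also shows `stringsAsFactors`, which controls whether text columns arrive as factors; the default was `TRUE` before R 4.0.0.

```r
# Inline CSV text standing in for a file or URL (subset of the oswego columns)
csv_text <- "ID,Age,Sex,Ill
2,52,F,Y
3,65,M,Y
8,40,F,Y"

# 'text =' reads from a character string instead of a file path
demo <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# Inspect the column types of the resulting data frame
str(demo)
```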
3. Demonstrate simple data manipulation (e.g., variable transformation, recoding, etc.);
oswego$Coffee <- as.character(oswego$Coffee)
oswego$Coffee[oswego$Coffee == "N"] <- "Abstainer"
oswego$Coffee[oswego$Coffee == "Y"] <- "Caffeinator"
oswego$Coffee
## [1] "Caffeinator" "Caffeinator" "Caffeinator" "Abstainer" "Caffeinator"
## [6] "Abstainer" "Abstainer" "Abstainer" "Abstainer" "Caffeinator"
## [11] "Abstainer" "Abstainer" "Caffeinator" "Abstainer" "Abstainer"
## [16] "Abstainer" "Abstainer" "Abstainer" "Caffeinator" "Caffeinator"
## [21] "Abstainer" "Abstainer" "Caffeinator" "Caffeinator" "Abstainer"
## [26] "Caffeinator" "Abstainer" "Caffeinator" "Caffeinator" "Abstainer"
## [31] "Abstainer" "Caffeinator" "Abstainer" "Caffeinator" "Abstainer"
## [36] "Caffeinator" "Abstainer" "Abstainer" "Caffeinator" "Abstainer"
## [41] "Abstainer" "Abstainer" "Abstainer" "Abstainer" "Caffeinator"
## [46] "Caffeinator" "Abstainer" "Abstainer" "Abstainer" "Caffeinator"
## [51] "Abstainer" "Caffeinator" "Abstainer" "Abstainer" "Caffeinator"
## [56] "Caffeinator" "Abstainer" "Caffeinator" "Caffeinator" "Caffeinator"
## [61] "Caffeinator" "Abstainer" "Abstainer" "Abstainer" "Caffeinator"
## [66] "Abstainer" "Abstainer" "Abstainer" "Caffeinator" "Abstainer"
## [71] "Abstainer" "Caffeinator" "Caffeinator" "Abstainer" "Abstainer"
oswego$Coffee[oswego$Coffee == "Abstainer"] <- 0
oswego$Coffee[oswego$Coffee == "Caffeinator"] <- 1
oswego$Coffee
## [1] "1" "1" "1" "0" "1" "0" "0" "0" "0" "1" "0" "0" "1" "0" "0" "0" "0"
## [18] "0" "1" "1" "0" "0" "1" "1" "0" "1" "0" "1" "1" "0" "0" "1" "0" "1"
## [35] "0" "1" "0" "0" "1" "0" "0" "0" "0" "0" "1" "1" "0" "0" "0" "1" "0"
## [52] "1" "0" "0" "1" "1" "0" "1" "1" "1" "1" "0" "0" "0" "1" "0" "0" "0"
## [69] "1" "0" "0" "1" "1" "0" "0"
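The final recode above leaves the values as the character strings "0" and "1". If a numeric indicator is wanted, one possible approach is sketched below, assuming a short stand-in vector (`coffee` is hypothetical; its values mirror the first five entries shown above).

```r
# Hypothetical stand-in for the original Y/N coffee column
coffee <- c("Y", "Y", "Y", "N", "Y")

# Recode to descriptive labels in one step
coffee_label <- ifelse(coffee == "Y", "Caffeinator", "Abstainer")

# Recode to a numeric 0/1 indicator (a logical comparison converted to numbers)
coffee_num <- as.numeric(coffee == "Y")

coffee_label
coffee_num
```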
4. Demonstrate the use of calendar and Julian dates;
In a study of posttraumatic stress, I am interested in calculating the number of days since September 11, 2001.
calendar <- c("2/2/2002", "5/8/2002", "9/2/2002", "4/3/2002", "8/29/2002", "10/31/2001")
calendar
## [1] "2/2/2002" "5/8/2002" "9/2/2002" "4/3/2002" "8/29/2002"
## [6] "10/31/2001"
jdate <- as.Date(calendar, format = "%m/%d/%Y")
jdate
## [1] "2002-02-02" "2002-05-08" "2002-09-02" "2002-04-03" "2002-08-29"
## [6] "2001-10-31"
sept11 <- as.Date("2001-09-11")
posttrauma <- (jdate - sept11)
posttrauma
## Time differences in days
## [1] 144 239 356 204 352 50
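A brief sketch of the same day-count arithmetic, assuming the reference date is written in ISO order (year-month-day) as "2001-09-11". It uses two of the calendar dates above and shows base R's `julian()` and `difftime()` as alternatives to plain subtraction.

```r
# Two of the calendar dates, parsed from month/day/year text
jdate <- as.Date(c("2/2/2002", "10/31/2001"), format = "%m/%d/%Y")
sept11 <- as.Date("2001-09-11")  # ISO order: year-month-day

# Days elapsed as a plain numeric vector
days <- as.numeric(jdate - sept11)
days

# julian() counts days since a chosen origin (default origin is 1970-01-01)
jdays <- julian(jdate, origin = sept11)
jdays

# difftime() allows other units, such as weeks
difftime(jdate, sept11, units = "weeks")
```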
5. Conduct a simple analysis using existing functions (from R, colleagues, etc.);
group1 <- c(8, 11, 6, 15, 8, 6, 2, 12, 19, 9, 10, 7)
group1
## [1] 8 11 6 15 8 6 2 12 19 9 10 7
group2 <- c(19, 10, 22, 25, 9, 15, 18, 16, 27)
group2
## [1] 19 10 22 25 9 15 18 16 27
t.test(group1, group2)
##
## Welch Two Sample t-test
##
## data: group1 and group2
## t = -3.486, df = 13.98, p-value = 0.003644
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -13.686 -3.258
## sample estimates:
## mean of x mean of y
## 9.417 17.889
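The printed summary can also be taken apart programmatically. A short sketch, assuming the result is stored in an object (the name `res` is hypothetical); `t.test` returns a list whose components include `statistic`, `p.value`, `conf.int`, and `estimate`.

```r
group1 <- c(8, 11, 6, 15, 8, 6, 2, 12, 19, 9, 10, 7)
group2 <- c(19, 10, 22, 25, 9, 15, 18, 16, 27)

# Store the test result instead of only printing it
res <- t.test(group1, group2)

res$statistic  # t-value
res$p.value    # two-sided p-value
res$conf.int   # 95 percent confidence interval for the mean difference
res$estimate   # the two group means
```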
6. Conduct a simple analysis demonstrating simple programming (e.g., a 'for' loop);
I have a t-value, and I would like to bootstrap (resample with replacement) 10,000 t-values that I can later use to calculate a bootstrapped confidence interval.
group1 <- c(8, 11, 6, 15, 8, 6, 2, 12, 19, 9, 10, 7)
group2 <- c(19, 10, 22, 25, 9, 15, 18, 16, 27)
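S <- 10000                # number of bootstrap replicates
t.values <- numeric(S)    # storage for the bootstrapped t-values
for (i in 1:S) {
    # resample each group with replacement, keeping the original group sizes
    g1 <- sample(group1, size = length(group1), replace = TRUE)
    g2 <- sample(group2, size = length(group2), replace = TRUE)
    t.values[i] <- t.test(g1, g2)$statistic
}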
head(t.values)
## [1] -5.360 -4.302 -4.675 -1.926 -3.911 -1.299
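One way the bootstrapped t-values might later be turned into an interval is the percentile method, sketched below. The seed, the smaller number of replicates, and the choice of the percentile method are all assumptions made for illustration; other bootstrap intervals exist.

```r
group1 <- c(8, 11, 6, 15, 8, 6, 2, 12, 19, 9, 10, 7)
group2 <- c(19, 10, 22, 25, 9, 15, 18, 16, 27)

set.seed(42)  # assumption: fixed seed so the resampling is reproducible
S <- 2000     # assumption: fewer replicates than above, to run quickly

t.values <- numeric(S)
for (i in seq_len(S)) {
    g1 <- sample(group1, size = length(group1), replace = TRUE)
    g2 <- sample(group2, size = length(group2), replace = TRUE)
    t.values[i] <- t.test(g1, g2)$statistic
}

# Percentile interval: the middle 95 percent of the bootstrapped t-values
ci <- quantile(t.values, probs = c(0.025, 0.975))
ci
```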
7. Conduct a simple analysis demonstrating an original function created by student.
I want a function to convert Celsius to Fahrenheit for when I am watching the weather and practicing R while traveling abroad some day.
tempconv <- function(x) {
    temperature <- x * 9/5 + 32
    return(temperature)
}
tempconv(0)
## [1] 32
tempconv(37.7778)
## [1] 100
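Because the arithmetic is vectorized, the same function should also accept several temperatures at once. A short sketch follows; the inverse helper `tempconv_inverse` is a hypothetical addition for checking the round trip and is not part of the original function.

```r
# Celsius to Fahrenheit, as defined above
tempconv <- function(x) {
    temperature <- x * 9/5 + 32
    return(temperature)
}

# Hypothetical inverse: Fahrenheit back to Celsius
tempconv_inverse <- function(f) {
    (f - 32) * 5/9
}

# Convert a whole vector of Celsius readings in one call
fahrenheit <- tempconv(c(-40, 0, 100))
fahrenheit

# Round trip: converting back should recover the Celsius input
tempconv_inverse(fahrenheit)
```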
8. Create a simple graph with title, axes labels, and legend;
I am interested in plotting HIV Testing Latency (i.e., length of time to HIV testing) as a function of Spiritual Beliefs among African American MSM.
setwd("/Users/wvincent/Desktop")
mp <- read.csv("mp.csv", header = TRUE)
plot(mp$hivlong ~ mp$spirbel, main = "HIV Testing Latency vs Spiritual Beliefs Among AA MSM",
    xlab = "Spiritual Beliefs", ylab = "HIV Testing Latency")
legend(5, 23, c("Adj. R^2 = 0.02", "b = -0.19", "SE = 0.11", "t = -1.67, p = 0.10"))
b <- lm(mp$hivlong ~ mp$spirbel)
abline(b)
summary(b)
##
## Call:
## lm(formula = mp$hivlong ~ mp$spirbel)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.17 -3.25 -1.57 1.68 19.33
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.806 2.199 4.00 0.00012 ***
## mp$spirbel -0.188 0.113 -1.67 0.09892 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.28 on 98 degrees of freedom
## Multiple R-squared: 0.0275, Adjusted R-squared: 0.0176
## F-statistic: 2.78 on 1 and 98 DF, p-value: 0.0989
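Since mp.csv is a local file, the sketch below uses simulated data to show the same plotting steps in a form that runs anywhere. The simulated values, the seed, and the temporary PDF output are all assumptions for illustration; the legend text is built from the fitted model rather than typed by hand.

```r
set.seed(1)  # assumption: simulated data, not the real mp.csv
spirbel <- round(runif(100, min = 5, max = 30))
hivlong <- 9 - 0.2 * spirbel + rnorm(100, sd = 5)

fit <- lm(hivlong ~ spirbel)

# Draw to a temporary PDF so the example leaves no file in the working directory
out <- tempfile(fileext = ".pdf")
pdf(out)
# In a formula y ~ x, the left-hand side goes on the vertical axis
plot(hivlong ~ spirbel,
     main = "HIV Testing Latency vs Spiritual Beliefs (simulated)",
     xlab = "Spiritual Beliefs", ylab = "HIV Testing Latency")
abline(fit)
legend("topright", bty = "n",
       legend = c(sprintf("b = %.2f", coef(fit)[["spirbel"]]),
                  sprintf("Adj. R^2 = %.2f", summary(fit)$adj.r.squared)))
invisible(dev.off())
```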
9. Demonstrate the use of a regular expression;
I am interested only in cases of HIV infection from one class of exposure: infection from birth (i.e., “pediatric cases” not resulting from blood transfusions) that might still exist.
exposure <- c("IDU", "RUAI", "RUVI", "BTRAN", "PED", "MEDJOB", "ASSAULT", "IDU",
"RUAI", "RUVI", "BTRAN", "PED", "MEDJOB", "ASSAULT", "IDU", "RUAI", "RUVI",
"BTRAN", "PED", "MEDJOB", "ASSAULT", "IDU", "RUAI", "RUVI", "BTRAN", "PED",
"MEDJOB", "ASSAULT")
grep("^P", exposure)
## [1] 5 12 19 26
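`grep` returns positions by default. A short sketch of two related options, assuming a shortened copy of the exposure vector: `value = TRUE` returns the matching strings themselves, and `grepl` returns a logical vector that can be used for subsetting or counting.

```r
# One cycle of the exposure categories used above
exposure <- c("IDU", "RUAI", "RUVI", "BTRAN", "PED", "MEDJOB", "ASSAULT")

# "^P" matches strings that begin with a capital P
ped_values <- grep("^P", exposure, value = TRUE)
ped_values

# Logical vector: TRUE where the pattern matches
is_ped <- grepl("^P", exposure)
is_ped

# Counting matches across the full, four-times-repeated vector
n_ped <- sum(grepl("^P", rep(exposure, 4)))
n_ped
```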
10. Demonstrate the use of the sink function to generate an output file;
sink("Sink Demo.txt")
print("This problem is 'sinking' down to the desktop.")
## [1] "This problem is 'sinking' down to the desktop."
calendar <- c("2/2/2002", "5/8/2002", "9/2/2002", "4/3/2002", "8/29/2002", "10/31/2001")
calendar
## [1] "2/2/2002" "5/8/2002" "9/2/2002" "4/3/2002" "8/29/2002"
## [6] "10/31/2001"
jdate <- as.Date(calendar, format = "%m/%d/%Y")
jdate
## [1] "2002-02-02" "2002-05-08" "2002-09-02" "2002-04-03" "2002-08-29"
## [6] "2001-10-31"
sept11 <- as.Date("2001-09-11")
posttrauma <- (jdate - sept11)
posttrauma
## Time differences in days
## [1] 144 239 356 204 352 50
sink()
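A compact sketch for confirming that `sink` captured the output, assuming a temporary file instead of a file in the working directory (the name `out` is hypothetical). Reading the file back with `readLines` shows exactly what was diverted.

```r
# Divert printed output to a temporary file
out <- tempfile(fileext = ".txt")
sink(out)
print("This line goes to the file, not the console.")
print(1:3)
sink()  # calling sink() with no arguments restores console output

# Read the file back to check what was written
captured <- readLines(out)
captured
```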