You’re a lab assistant for a multi-billion dollar drug company called Novartirosche. The company has just developed a new cognitive performance enhancing drug called drug.x that it expects will revolutionize the industry. To test the performance of the drug, the company recruited 1,000 participants to perform one of two cognitive tasks after having taken drug.x or a placebo sugar pill. Participants assigned to the ‘wordsearch’ task have to find 100 words in a jumbled list as fast as possible. Participants assigned to the ‘animals’ task have to name 20 different animals as quickly as possible. For each task, a lab assistant recorded how long it took each participant, in seconds, to complete their assigned task. The results are stored in a tab-delimited text file at https://dl.dropboxusercontent.com/u/7618380/drug.txt
drug <- read.table("https://dl.dropboxusercontent.com/u/7618380/drug.txt")
# That code didn't work for some people -- I'm not sure why. If it doesn't work for you, download the file to your computer, then replace the https link above with the file's path on your computer. For example:
#drug <- read.table(file = "/Users/nathaniel/desktop/drug.txt")
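If you end up reading a local copy, it can help to spell out the `header` and `sep` arguments rather than relying on the defaults. The sketch below is only an illustration: it writes a tiny made-up tab-delimited file to a temporary location (the `toy` values and the `tmp` path are assumptions, not the real drug data) and reads it back.

```r
# Toy stand-in for drug.txt: a tab-delimited file with a header row.
# The values below are made up for illustration only.
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(drug = c("drug.x", "placebo", "placebo"),
                  task = c("animals", "wordsearch", "animals"),
                  time.s = c(171, 231, 131))
write.table(toy, file = tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back, stating explicitly that the first line holds the column
# names and that fields are separated by tabs.
toy.read <- read.table(file = tmp, header = TRUE, sep = "\t")
str(toy.read)
```

With the real file you would replace `tmp` with the path to your downloaded copy of drug.txt.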
drug[1:5,]
## drug task sex.asdf time.s age phone.number id
## 677 drug.x animals female 171 37 9322672379 268
## 348 placebo wordsearch female 231 37 7180780308 128
## 429 placebo animals other 131 24 8508113795 331
## 385 placebo wordsearch male 230 24 1648030302 254
## 419 placebo animals female 130 41 6482193027 396
head(drug)
## drug task sex.asdf time.s age phone.number id
## 677 drug.x animals female 171 37 9322672379 268
## 348 placebo wordsearch female 231 37 7180780308 128
## 429 placebo animals other 131 24 8508113795 331
## 385 placebo wordsearch male 230 24 1648030302 254
## 419 placebo animals female 130 41 6482193027 396
## 263 placebo wordsearch female 231 15 2494861347 13
drug[50:60,]
## drug task sex.asdf time.s age phone.number id
## 462 placebo animals other 129 15 2753248327 111
## 407 placebo animals female 129 37 6592522210 694
## 100 placebo wordsearch male 230 26 3410544564 427
## 487 placebo animals other 130 31 8102985146 67
## 840 drug.x animals other 169 29 6430696567 192
## 727 drug.x animals female 168 32 3334892955 905
## 685 drug.x animals other 170 23 1231159931 564
## 996 drug.x animals male 171 23 5439840998 620
## 119 placebo wordsearch female 229 30 1807868391 878
## 585 drug.x wordsearch male 270 28 6975130923 276
## 317 placebo wordsearch male 231 28 2908492601 435
View(drug)
summary(drug)
## drug task sex.asdf time.s age
## drug.x :500 animals :500 female:348 Min. :128 Min. :10.00
## placebo:500 wordsearch:500 male :348 1st Qu.:170 1st Qu.:25.00
## other :304 Median :200 Median :30.00
## Mean :200 Mean :30.01
## 3rd Qu.:230 3rd Qu.:35.00
## Max. :273 Max. :50.00
## phone.number id
## Min. :1.009e+09 Min. : 1.0
## 1st Qu.:3.102e+09 1st Qu.: 250.8
## Median :5.416e+09 Median : 500.5
## Mean :5.442e+09 Mean : 500.5
## 3rd Qu.:7.688e+09 3rd Qu.: 750.2
## Max. :9.998e+09 Max. :1000.0
drug[drug$id == 314,]
## drug task sex.asdf time.s age phone.number id
## 747 drug.x animals female 170 36 4925379726 314
names(drug)
## [1] "drug" "task" "sex.asdf" "time.s"
## [5] "age" "phone.number" "id"
names(drug)[3] <- "sex"
time.m <- drug$time.s / 60
drug.female <- subset(drug, sex == "female")
drug.male <- subset(drug, sex == "male")
mean(drug.female$drug == "placebo")
## [1] 0.5
mean(drug.female$time.s)
## [1] 202.0345
mean(drug.male$drug == "placebo")
## [1] 0.5
mean(drug.male$time.s)
## [1] 201.1983
drug.oldest <- drug[drug$age == max(drug$age),]
drug.youngest <- drug[drug$age == min(drug$age),]
table(drug$age)
##
## 10 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
## 2 1 2 5 4 5 6 15 13 18 28 29 38 41 52 44 60 56 69 49 47 52 58 48 47
## 36 37 38 39 40 41 42 43 44 45 47 48 49 50
## 41 38 25 26 22 17 13 10 6 7 3 1 1 1
In the next question, you need to change specific values of a vector based on some criteria. We learned how to do this in Chapter 5. If you forgot how, here’s an example:
a <- c(1, 1, 1, 1, 2, 2, 2, 2)
a == 2
## [1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
a[a == 2] <- 10
a
## [1] 1 1 1 1 10 10 10 10
drug$age[drug$age < 18] <- 18
table(drug$age)
##
## 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
## 40 13 18 28 29 38 41 52 44 60 56 69 49 47 52 58 48 47 41 38 25 26 22 17 13
## 43 44 45 47 48 49 50
## 10 6 7 3 1 1 1
heightweight <- read.table("https://dl.dropboxusercontent.com/u/7618380/moredata.txt", sep = "\t")
# Again, if the code doesn't work, download the file to your computer, then put the file path from your computer as the argument to read.table()
drug <- cbind(drug, heightweight)
with(drug, mean(time.s[drug == "placebo"]))
## [1] 210.03
with(drug, mean(time.s[drug == "drug.x"]))
## [1] 190.022
Response times with drug.x were, on average, lower than with the placebo. Therefore, drug.x appeared to help.
with(drug, mean(time.s[task == "animals"]))
## [1] 162.022
with(drug, mean(time.s[task == "wordsearch"]))
## [1] 238.03
The wordsearch task was much harder than the animal naming task: it took about 76 seconds longer on average.
with(drug, mean(time.s[task == "animals" & drug == "drug.x"]))
## [1] 170.0125
with(drug, mean(time.s[task == "animals" & drug == "placebo"]))
## [1] 130.06
with(drug, mean(time.s[task == "wordsearch" & drug == "drug.x"]))
## [1] 270.06
with(drug, mean(time.s[task == "wordsearch" & drug == "placebo"]))
## [1] 230.0225
drug.x appeared to lead to SLOWER response times in both tasks! Crazy!
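The four task-by-drug means can also be computed in a single call. Here is a minimal sketch using base R's `aggregate()`, run on a small made-up dataframe (`toy`, an assumption for illustration) that has the same column names as `drug`; with the real data you would pass `data = drug` instead.

```r
# Made-up data with the same column names as the drug dataframe.
toy <- data.frame(
  drug   = c("drug.x", "drug.x", "placebo", "placebo",
             "drug.x", "drug.x", "placebo", "placebo"),
  task   = c("animals", "animals", "animals", "animals",
             "wordsearch", "wordsearch", "wordsearch", "wordsearch"),
  time.s = c(170, 172, 130, 132, 270, 272, 230, 232)
)

# Mean time.s for every combination of task and drug.
cell.means <- aggregate(time.s ~ task + drug, data = toy, FUN = mean)
cell.means
```

The result is a dataframe with one row per task and drug combination, which makes the within-task comparison easy to read off.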
with(drug, table(task, drug))
## drug
## task drug.x placebo
## animals 400 100
## wordsearch 100 400
Simpson’s paradox…again! People given drug.x were mostly assigned to the easier task (400 of 500 did the animals task), while people given the placebo were mostly assigned to the harder task (400 of 500 did the wordsearch task). This is why people given drug.x appeared to do better on average compared to the placebo. However, people given the placebo actually did better on BOTH tasks. drug.x sucks.
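You can check the arithmetic behind the paradox directly: each drug's overall mean is a weighted average of its two task means, weighted by how many participants did each task. The sketch below plugs in the cell means and counts from the output above (the object names are made up for illustration).

```r
# Cell means (seconds) and cell counts copied from the output above.
mean.drugx   <- c(animals = 170.0125, wordsearch = 270.06)
mean.placebo <- c(animals = 130.06,   wordsearch = 230.0225)
n.drugx      <- c(animals = 400, wordsearch = 100)
n.placebo    <- c(animals = 100, wordsearch = 400)

# Overall mean per drug = task means weighted by number of participants.
overall.drugx   <- weighted.mean(mean.drugx, w = n.drugx)
overall.placebo <- weighted.mean(mean.placebo, w = n.placebo)

overall.drugx    # should agree with the 190.022 reported earlier
overall.placebo  # should agree with the 210.03 reported earlier

# drug.x is slower within each task, yet faster overall.
mean.drugx > mean.placebo
overall.drugx < overall.placebo
```

Because drug.x's average is dominated by the fast animals task and the placebo's by the slow wordsearch task, the overall comparison flips even though drug.x is about 40 seconds slower within each task.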