The datafile has 100 rows and 6 columns. Here are the columns:
sex: A string indicating the sex of the participant. “m” = male, “f” = female.
age: An integer indicating the age of the participant.
prime: A string indicating the priming condition the participant was in. “elderly” = elderly prime words, “neutral” = neutral prime words.
time: A number indicating how long, in seconds, it took the participant to walk through the hallway.
donate: An integer indicating whether or not the participant was willing to donate to a charity for elderly homeless people. 1 indicates that they did donate, 0 indicates no donation.
favorite.number: An integer indicating the person’s favorite number between 0 and 100. I don’t know why this question was asked…
A. Open your WPA.RProject (the one you created last week) and open a new script. Save the script with the name WPA5.R.
B. Using read.table(), load the tab-delimited text file containing the data into R from http://nathanieldphillips.com/wp-content/uploads/2016/04/priming-5.txt and assign it to a new object called elderly. Make sure to specify that the file is tab-delimited with the argument sep = "\t" and that it contains a header with the argument header = TRUE (or its shorthand, header = T).
elderly <- read.table("http://nathanieldphillips.com/wp-content/uploads/2016/04/priming-5.txt",
                      sep = "\t",
                      header = TRUE)   # TRUE is safer than T, which can be overwritten
C. Using write.table(), save the data as a text file called elderly.txt into the data folder in your working directory. That way you’ll always have access to the data even if it’s deleted from the website you downloaded it from.
write.table(elderly,
file = "data/elderly.txt",
sep = "\t")
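If you want to check that a saved file can be read back in, here is a minimal sketch. It uses a tiny stand-in dataframe (values copied from the first three rows of the data) and a temporary file rather than data/elderly.txt. The names demo, demo.file and demo.2 are made up for this example, and row.names = FALSE is an optional choice that simply keeps the row numbers out of the file.

```r
# Tiny stand-in for the elderly dataframe (three rows, three columns)
demo <- data.frame(sex = c("f", "m", "m"),
                   age = c(21, 25, 21),
                   time = c(12.9, 10.7, 10.5))

# Save to a temporary file (stand-in for "data/elderly.txt")
demo.file <- tempfile(fileext = ".txt")
write.table(demo,
            file = demo.file,
            sep = "\t",
            row.names = FALSE)   # optional: leave the row numbers out of the file

# Read the file back in and check that it matches what we saved
demo.2 <- read.table(demo.file,
                     sep = "\t",
                     header = TRUE)
all.equal(demo$time, demo.2$time)
```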
D. Look at the first few rows of the dataframe with the head() function to make sure it looks ok.
head(elderly)
## sex age prime time donate favorite.number
## 1 f 21 elderly 12.9 0 75
## 2 m 25 elderly 10.7 0 80
## 3 m 21 elderly 10.5 1 71
## 4 m 25 elderly 8.6 0 79
## 5 f 22 elderly 11.5 1 66
## 6 f 22 elderly 10.6 1 69
E. Using the summary() function, look at summary statistics for each column in the dataframe. Make sure everything looks ok.
summary(elderly)
## sex age prime time donate
## f:45 Min. :17.00 elderly:50 Min. : 7.50 Min. :0.00
## m:55 1st Qu.:21.00 neutral:50 1st Qu.: 9.50 1st Qu.:0.00
## Median :22.00 Median :10.25 Median :0.00
## Mean :21.98 Mean :10.21 Mean :0.48
## 3rd Qu.:23.00 3rd Qu.:10.90 3rd Qu.:1.00
## Max. :28.00 Max. :12.90 Max. :1.00
## favorite.number
## Min. :56.00
## 1st Qu.:68.00
## Median :71.50
## Mean :71.51
## 3rd Qu.:75.00
## Max. :86.00
Chi-square: X(df) = XXX, p = YYY
t-test: t(df) = XXX, p = YYY
correlation test: r = XXX, t(df) = YYY, p = ZZZZ
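Test objects in R store their results by name, which makes it easier to fill in these templates without retyping numbers. Here is a minimal sketch using simulated data rather than the real walking times (x.sim, my.test and apa.string are made-up names):

```r
set.seed(100)   # make the simulated data reproducible

# Simulated values, NOT the real walking times
x.sim <- rnorm(n = 100, mean = 10, sd = 1)

my.test <- t.test(x = x.sim, mu = 10)

# Pull the pieces out of the test object by name
t.stat <- my.test$statistic   # the t statistic
t.df <- my.test$parameter     # the degrees of freedom
t.p <- my.test$p.value        # the p-value

# Paste them together in the t(df) = XXX, p = YYY format
apa.string <- paste("t(", round(t.df, 2), ") = ", round(t.stat, 2),
                    ", p = ", round(t.p, 2), sep = "")
apa.string
```

The same $statistic, $parameter and $p.value elements are also present in the objects returned by cor.test() and chisq.test().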
t.test(x = elderly$time,
mu = 10)
##
## One Sample t-test
##
## data: elderly$time
## t = 1.9545, df = 99, p-value = 0.05346
## alternative hypothesis: true mean is not equal to 10
## 95 percent confidence interval:
## 9.996886 10.413114
## sample estimates:
## mean of x
## 10.205
Answer: No, the amount of time people took to walk down the hallway was not significantly different from 10 (t(99) = 1.95, p = .053). That said, the p-value is very close to .05, so some researchers might still call this marginally significant.
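For reference, the t statistic in a one-sample test is the distance between the sample mean and mu, measured in standard errors. Here is a sketch that computes it by hand and compares it to t.test(). It uses simulated data as a stand-in for elderly$time, and the names time.sim, t.manual and p.manual are made up:

```r
set.seed(10)   # make the simulated data reproducible

# Simulated stand-in for elderly$time
time.sim <- rnorm(n = 100, mean = 10.2, sd = 1)
mu <- 10
n <- length(time.sim)

# t = (sample mean - mu) / standard error of the mean
t.manual <- (mean(time.sim) - mu) / (sd(time.sim) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p.manual <- 2 * pt(abs(t.manual), df = n - 1, lower.tail = FALSE)

# Compare to the built-in test
t.builtin <- t.test(x = time.sim, mu = mu)
c(manual = t.manual, builtin = unname(t.builtin$statistic))
c(manual = p.manual, builtin = t.builtin$p.value)
```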
t.test(formula = time ~ sex,
data = elderly)
##
## Welch Two Sample t-test
##
## data: time by sex
## t = 1.0728, df = 87.57, p-value = 0.2863
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1954849 0.6540707
## sample estimates:
## mean in group f mean in group m
## 10.33111 10.10182
Answer: No, men and women did not have significantly different walking times (t(87.57) = 1.07, p = .29).
cor.test(formula = ~ time + age,
data = elderly)
##
## Pearson's product-moment correlation
##
## data: time and age
## t = -0.72612, df = 98, p-value = 0.4695
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2657526 0.1250622
## sample estimates:
## cor
## -0.07315292
Answer: No, there was not a significant relationship between age and walking time (r = -0.07, t(98) = -0.73, p = .47).
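The t statistic in a correlation test comes directly from r and the degrees of freedom, so you can double-check the numbers you report. A small sketch using the values printed in the output above (t.from.r and p.from.r are made-up names):

```r
# Values copied from the cor.test() output for time and age
r <- -0.07315292
df <- 98

# t = r * sqrt(df) / sqrt(1 - r^2)
t.from.r <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p.from.r <- 2 * pt(abs(t.from.r), df = df, lower.tail = FALSE)

round(t.from.r, 2)
round(p.from.r, 2)
```

Both values should agree with the t and p-value lines of the output.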
cor.test(formula = ~ age + favorite.number,
data = elderly)
##
## Pearson's product-moment correlation
##
## data: age and favorite.number
## t = 5.1102, df = 98, p-value = 1.59e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2882498 0.6009704
## sample estimates:
## cor
## 0.4586976
Answer: Yes. There was a significant positive correlation between a person’s age and their favorite number (r = 0.46, t(98) = 5.11, p < .01).
chisq.test(table(elderly$sex, elderly$donate))
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: table(elderly$sex, elderly$donate)
## X-squared = 0.0016188, df = 1, p-value = 0.9679
Answer: No, men and women did not differ significantly in how likely they were to donate (X(1) = 0.002, p = 0.97)
chisq.test(table(elderly$sex))
##
## Chi-squared test for given probabilities
##
## data: table(elderly$sex)
## X-squared = 1, df = 1, p-value = 0.3173
Answer: No, there were not significantly different numbers of men versus women (X(1) = 1.00, p = 0.32)
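To see where X-squared = 1 comes from, here is a sketch that redoes the one-sample chi-square test by hand. The counts (45 women, 55 men) are taken from the summary() output; the object names are made up for this example:

```r
# Observed counts, copied from the summary() output
observed <- c(f = 45, m = 55)

# Under the null hypothesis, each group is equally likely
expected <- rep(sum(observed) / length(observed), length(observed))

# Chi-square statistic: sum of (observed - expected)^2 / expected
chisq.manual <- sum((observed - expected) ^ 2 / expected)

# p-value from the chi-square distribution with (groups - 1) df
p.manual <- pchisq(chisq.manual,
                   df = length(observed) - 1,
                   lower.tail = FALSE)

# Compare to the built-in test
chisq.builtin <- chisq.test(observed)
c(manual = chisq.manual, builtin = unname(chisq.builtin$statistic))
c(manual = p.manual, builtin = chisq.builtin$p.value)
```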
t.test(x = elderly$age,
mu = 100
)
##
## One Sample t-test
##
## data: elderly$age
## t = -355.08, df = 99, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 100
## 95 percent confidence interval:
## 21.54402 22.41598
## sample estimates:
## mean of x
## 21.98
Answer: Yes, the average age of participants was significantly less than 100 (t(99) = -355.08, p < .01)
chisq.test(table(elderly$sex,
elderly$prime))
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: table(elderly$sex, elderly$prime)
## X-squared = 0, df = 1, p-value = 1
Answer: No, men and women were both equally likely to be assigned to the elderly prime condition (X(1) = 0, p = 1)
t.test(formula = time ~ prime,
data = elderly)
##
## Welch Two Sample t-test
##
## data: time by prime
## t = 4.6412, df = 97.576, p-value = 1.079e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.5071494 1.2648506
## sample estimates:
## mean in group elderly mean in group neutral
## 10.648 9.762
Answer: Yes, people given the elderly prime took a significantly longer time to walk down the hallway than those given the neutral prime (t(97.58) = 4.64, p < .01)
cor.test(formula = ~ age + time,
data = elderly,
subset = prime == "elderly"
)
##
## Pearson's product-moment correlation
##
## data: age and time
## t = -1.1908, df = 48, p-value = 0.2396
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4275787 0.1143481
## sample estimates:
## cor
## -0.1693911
Answer: No, people in the elderly condition did not show a significant relationship between age and walking time (r = -0.17, t(48) = -1.19, p = 0.24)
elderly.2 <- subset(elderly, sex == "m" & age < 22)
chisq.test(table(elderly.2$prime, elderly.2$donate))
## Warning in chisq.test(table(elderly.2$prime, elderly.2$donate)): Chi-
## squared approximation may be incorrect
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: table(elderly.2$prime, elderly.2$donate)
## X-squared = 0, df = 1, p-value = 1
Answer: No, men who were younger than 22 did not show a significant effect of prime on the likelihood of donating (X(1) = 0, p = 1). Note that we received a warning (not an error) from R because the amount of data we tested was so small!
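R prints that warning when some of the expected cell counts are small. One common alternative in that situation, which is not part of the original answer, is Fisher's exact test with fisher.test(). The counts in this sketch are invented purely for illustration; they are not the real counts from elderly.2.

```r
# Invented 2 x 2 table of counts (NOT the real data)
small.table <- matrix(c(3, 2,
                        2, 3),
                      nrow = 2,
                      byrow = TRUE,
                      dimnames = list(prime = c("elderly", "neutral"),
                                      donate = c("0", "1")))

# The chi-square test warns here; the expected counts show why
chisq.small <- suppressWarnings(chisq.test(small.table))
chisq.small$expected   # every expected count is below 5

# Fisher's exact test does not rely on the chi-square approximation
fisher.small <- fisher.test(small.table)
fisher.small$p.value
```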
t.test(formula = time ~ prime,
data = elderly,
subset = sex == "f" & donate == 1
)
##
## Welch Two Sample t-test
##
## data: time by prime
## t = 3.2929, df = 16.849, p-value = 0.004337
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.3997131 1.8280647
## sample estimates:
## mean in group elderly mean in group neutral
## 10.891667 9.777778
Answer: Yes, of the women who made a donation, those given an elderly prime took significantly longer to walk down the hallway compared to those given the neutral prime (t(16.85) = 3.29, p < .01).
Answer: Assuming the null hypothesis is correct – that men younger than 22 are equally likely to donate no matter which prime they get – the probability of getting a test statistic as or more extreme than the one we got was 1.00
# There are infinitely many ways to do this
# Here are a few
elderly$y.p <- elderly$age * 2
elderly$y.p <- elderly$age * 2 - 50
elderly$y.p <- elderly$age / 10
# ...
cor.test(formula = ~ y.p + age, data = elderly)
##
## Pearson's product-moment correlation
##
## data: y.p and age
## t = Inf, df = 98, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 1 1
## sample estimates:
## cor
## 1
# Again, there are an infinite number of ways to do this
elderly$y.n <- -2 * elderly$age
elderly$y.n <- -10 * elderly$age - 22
# ...
cor.test(formula = ~ y.n + age,
data = elderly)
##
## Pearson's product-moment correlation
##
## data: y.n and age
## t = NaN, df = 98, p-value = NA
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## NaN NaN
## sample estimates:
## cor
## -1
# Create x, our independent variable
x <- rnorm(n = 100, mean = 10, sd = 1)
# Create y.1, slightly positively correlated with x
y.1 <- x + rnorm(n = 100, mean = 5, sd = 4)
# Create y.2, highly negatively correlated with x
y.2 <- -1 * x + rnorm(n = 100, mean = 13, sd = .3)
# Plot data
plot(1,
xlim = c(5, 15),
ylim = c(0, 25),
ylab = "y",
xlab = "x",
type = "n")
points(x, y.1, pch = 21, bg = "red", col = "white")
points(x, y.2, pch = 21, bg = "blue", col = "white")
# Add labels
text(6, 15, paste("cor(x, y.1) = ", round(cor(x, y.1), 2), sep = ""))
text(6, 2.5, paste("cor(x, y.2) = ", round(cor(x, y.2), 2), sep = ""))
# Add legend
legend("topright",
legend = c("y.1", "y.2"),
pch = 16,
col = c("red", "blue"),
bty = "n")
Using the rnorm() function, add a new column to the elderly dataframe that has a correlation somewhere between 0.30 and 0.60 with age.
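One possible approach, sketched here with simulated ages so it runs on its own: add normally distributed noise to age and keep drawing until the correlation lands in the target range. The names age.sim, y.mid and r.mid are made up, and the choice of noise with twice the standard deviation of age is just one option that gives an expected correlation of about 1 / sqrt(5) = 0.45.

```r
set.seed(5)   # make the random numbers reproducible

# Simulated stand-in for elderly$age (NOT the real ages)
age.sim <- round(rnorm(n = 100, mean = 22, sd = 2))

# Add normal noise to age, redrawing until the correlation is in range
repeat {
  y.mid <- age.sim + rnorm(n = length(age.sim),
                           mean = 0,
                           sd = 2 * sd(age.sim))
  r.mid <- cor(y.mid, age.sim)
  if (r.mid > .30 & r.mid < .60) break   # stop once we land between .30 and .60
}

r.mid
```

With the real data, you would replace age.sim with elderly$age and store the result as a new column, for example elderly$y.mid.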
Create the following plot
library(yarrr)
# Create elderly and neutral data frames
elderly.elderly <- subset(elderly, prime == "elderly")
elderly.neutral <- subset(elderly, prime == "neutral")
# Create the plot with elderly data
plot(x = elderly.elderly$age,
y = elderly.elderly$time,
xlim = c(10, 30),
ylim = c(5, 15),
pch = 16,
col = transparent("red", .8),
cex = 1,
xlab = "Age",
ylab = "Time",
main = "Age and Walking Times\nElderly vs. Neutral Prime"
)
# Add mean line for elderly data
abline(h = mean(elderly.elderly$time),
col = transparent("red", .8), lwd = 5)
# Add text above mean line
text(12, mean(elderly.elderly$time),
labels = paste("Mean = ", round(mean(elderly.elderly$time), 2), sep = ""),
pos = 3
)
# Add points for neutral prime data
points(x = elderly.neutral$age,
y = elderly.neutral$time,
pch = 16,
col = transparent("blue", .8),
cex = 1
)
# Add mean line
abline(h = mean(elderly.neutral$time),
col = transparent("blue", .8), lwd = 5)
# Add text
text(12, mean(elderly.neutral$time),
labels = paste("Mean = ", round(mean(elderly.neutral$time), 2), sep = ""),
pos = 1)
# Run t.test
my.test <- t.test(formula = time ~ prime, data = elderly)
# Add text of apa format of t.test
text(10, 6, labels = apa(my.test), cex = .7, adj = 0)
# Add legend
legend("topleft",
legend = c("Prime = Elderly", "Prime = Neutral"),
pch = c(16, 16),
col = c(transparent("red", .8), transparent("blue", .8)),
bty = "n"
)