The data file has 100 rows and 6 columns. Here are the columns:
sex: A string indicating the sex of the participant. “m” = male, “f” = female.
age: An integer indicating the age of the participant.
prime: A string indicating the priming condition the participant was in. “elderly” = elderly prime words, “neutral” = neutral prime words.
time: A number indicating how many seconds it took the participant to walk through the hallway.
donate: An integer indicating whether or not the participant was willing to donate to a charity for elderly homeless people. 1 indicates that they did donate, 0 indicates no donation.
favorite.number: An integer indicating the person’s favorite number between 0 and 100. I don’t know why this question was asked…
A. Open your WPA.RProject (the one you created last week) and open a new script. Save the script with the name WPA5.R.
B. Using read.table(), load the tab-delimited text file containing the data into R from http://nathanieldphillips.com/wp-content/uploads/2016/04/priming-5.txt and assign it to a new object called elderly. Make sure to specify that the file is tab-delimited with the argument sep = "\t" and that it contains a header with the argument header = TRUE.
C. Using write.table(), save the data as a text file called elderly.txt into the data folder in your working directory. That way you’ll always have access to the data even if it’s deleted from the website you downloaded it from.
D. Look at the first few rows of the dataframe with the head() function to make sure it looks ok.
E. Using the summary() function, look at summary statistics for each column in the dataframe. Make sure everything looks ok.
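As a rough sketch of the mechanics in steps B through E, the following code writes a small tab-delimited file and reads it back. The data frame toy and the temporary file path are hypothetical stand-ins, not the real study data; for the assignment itself, the file argument of read.table() would be the URL given in step B.

```r
# Hypothetical stand-in for the downloaded data (not the real study data)
toy <- data.frame(sex = c("m", "f", "f"),
                  age = c(21L, 25L, 23L),
                  time = c(9.8, 11.2, 10.4))

# Write a tab-delimited file with a header row and no row names
path <- file.path(tempdir(), "toy.txt")
write.table(toy, file = path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back, telling R about the tab delimiter and the header row
toy.in <- read.table(file = path, sep = "\t", header = TRUE)

# Quick checks that the import looks right
head(toy.in)
summary(toy.in)
```

Setting row.names = FALSE when writing keeps R from adding an extra unnamed column of row labels, which would otherwise shift the header when the file is read back in.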
Chi-square: X(df) = XXX, p = YYY
t-test: t(df) = XXX, p = YYY
correlation test: r = XXX, t(df) = YYY, p = ZZZZ
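One possible way to pull the numbers for these report formats out of a test object is sketched below. The vector x is simulated and purely illustrative; the element names statistic, parameter and p.value are the ones stored in the objects returned by t.test(), cor.test() and chisq.test().

```r
set.seed(1)                                # make the simulated data reproducible
x <- rnorm(n = 30, mean = 10.5, sd = 2)    # hypothetical walking times

# One-sample t-test against a null mean of 10
result <- t.test(x = x, mu = 10)

# Test objects store the pieces needed for the report
df.val <- result$parameter   # degrees of freedom
t.val  <- result$statistic   # test statistic
p.val  <- result$p.value     # p-value

# Assemble a string in the form t(df) = XXX, p = YYY
apa <- paste0("t(", df.val, ") = ", round(t.val, 2), ", p = ", round(p.val, 3))
apa
```

Running names(result) lists everything else the object contains, such as the confidence interval and the estimated mean.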
Was the overall time that people took to walk down the hallway significantly different from 10 seconds? Answer this with a one-sample t-test.
Did men and women take a significantly different amount of time to walk down the hallway? Answer this with a two-sample t-test.
Was there a significant relationship between age and walking time? Answer this with a correlation test.
Was there a significant relationship between age and a person’s favorite number? Answer this with a correlation test.
Did men and women differ in how likely they were to donate? Answer this with a two-sample chi-square test.
Were there significantly different numbers of men versus women? Answer this with a one-sample chi-square test.
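The difference between the one-sample and two-sample chi-square tests can be sketched on made-up counts. The data frame toy below is hypothetical (its column names mirror the assignment, but the values are invented), so the output says nothing about the real data.

```r
# Hypothetical data: 35 men and 25 women, with invented donation outcomes
toy <- data.frame(sex    = rep(c("m", "f"), times = c(35, 25)),
                  donate = rep(c(1, 0, 1, 0), times = c(20, 15, 10, 15)))

# One-sample chi-square: do the group counts differ from equal proportions?
one.sample <- chisq.test(x = table(toy$sex))

# Two-sample chi-square: is one categorical variable related to another?
two.sample <- chisq.test(x = table(toy$sex, toy$donate))

one.sample$parameter   # df = number of groups - 1
two.sample$parameter   # df = (rows - 1) * (columns - 1)
```

In both cases the input is a table of counts rather than the raw column, which is why table() is wrapped around the variables first.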
Was the average age of the participants significantly different from 100?
Were men significantly more or less likely to be assigned to the elderly prime condition compared to women?
Was there a significant effect of the prime on walking time?
Only for participants in the elderly prime condition, was there a relationship between age and walking time?
Only for men younger than 22, was there a significant effect of the prime on the likelihood of donating?
Only for women who made a donation, was there a significant effect of the prime on time?
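Questions that begin with "Only for..." call for a test on a subset of the rows. A minimal sketch, assuming a simulated data frame toy and an arbitrary age cutoff of 25 (neither comes from the assignment), could look like this:

```r
set.seed(3)
# Hypothetical data, simulated only to show the mechanics
toy <- data.frame(sex   = rep(c("m", "f"), each = 40),
                  prime = rep(c("elderly", "neutral"), times = 40),
                  age   = rep(18:27, times = 8),
                  time  = rnorm(n = 80, mean = 10, sd = 2))

# Keep only the rows that satisfy both conditions
women.young <- subset(toy, sex == "f" & age < 25)

# Two-sample t-test on the subset, using the formula interface
result <- t.test(time ~ prime, data = women.young)
result
```

Storing the subset in its own object first makes it easy to check with head() or nrow() that the filter did what was intended before running the test.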
In words, interpret the p-value that you got in Question 11.
Add a new column to the elderly dataframe called y.p that has a correlation of +1 with age. Confirm that the correlation is +1 using cor.test()
Add a new column to the elderly dataframe called y.n that has a correlation of -1 with favorite.number. Confirm that the correlation is -1 using cor.test()
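One way to think about perfect correlations: any variable of the form a * x + b is perfectly correlated with x, with the sign of the correlation matching the sign of a. The sketch below uses a hypothetical vector x and arbitrary constants to illustrate the idea.

```r
x <- c(2, 5, 7, 11, 13)    # hypothetical values

y.up   <- 3 * x + 10       # positive slope gives a correlation of +1
y.down <- -2 * x + 4       # negative slope gives a correlation of -1

# The estimate element holds the sample correlation
cor.test(x, y.up)$estimate
cor.test(x, y.down)$estimate
```

The particular slope and intercept do not matter; only the sign of the slope determines whether the correlation is +1 or -1.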
The rnorm(n, mean, sd) function allows you to generate random data from a normal distribution with a specified mean and standard deviation. You can use the rnorm() function to add random noise to an existing variable. For example, the following code generates (and plots) two variables, y.1 and y.2, that are correlated with an independent variable x. y.1 is slightly positively correlated with x, and y.2 is highly negatively correlated with x.
# Create x, our independent variable
x <- rnorm(n = 100, mean = 10, sd = 1)
# Create y.1, slightly positively correlated with x
y.1 <- x + rnorm(n = 100, mean = 5, sd = 4)
# Create y.2, highly negatively correlated with x
y.2 <- -1 * x + rnorm(n = 100, mean = 13, sd = .3)
# Plot data
plot(1,
     xlim = c(5, 15),
     ylim = c(0, 25),
     ylab = "y",
     xlab = "x",
     type = "n")
points(x, y.1, pch = 21, bg = "red", col = "white")
points(x, y.2, pch = 21, bg = "blue", col = "white")
# Add labels
text(6, 15, paste("cor(x, y.1) = ", round(cor(x, y.1), 2), sep = ""))
text(6, 2.5, paste("cor(x, y.2) = ", round(cor(x, y.2), 2), sep = ""))
# Add legend
legend("topright",
       legend = c("y.1", "y.2"),
       pch = 16,
       col = c("red", "blue"),
       bty = "n")
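The strength of the correlation in code like this is governed by the standard deviation of the added noise. When y is x plus independent noise, the expected correlation is sd(x) / sqrt(sd(x)^2 + sd(noise)^2), which is why a noise sd of 4 gives a weak correlation and a noise sd of .3 gives a strong one. The sketch below checks this with a hypothetical noise sd of 2 and a large sample; the names noise.sd, expected and observed are illustrative only.

```r
set.seed(4)
n <- 10000                      # large n so the sample value sits near the expected one
x <- rnorm(n = n, mean = 10, sd = 1)

noise.sd <- 2                   # hypothetical choice of noise level
y <- x + rnorm(n = n, mean = 0, sd = noise.sd)

# sd(x) / sqrt(sd(x)^2 + sd(noise)^2), here 1 / sqrt(5), roughly 0.45
expected <- 1 / sqrt(1^2 + noise.sd^2)
observed <- cor(x, y)

c(expected = expected, observed = observed)
```

With only 100 observations the observed correlation will bounce around the expected value more, so it is worth checking the result with cor() after generating the data.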
Using the rnorm() function, add a new column to the elderly dataframe that has a correlation somewhere between 0.30 and 0.60 with age.
Create the following plot