Datafile description

The datafile has 100 rows and 6 columns. Here are the columns:

sex: A string indicating the sex of the participant. “m” = male, “f” = female.
age: An integer indicating the age of the participant.
prime: A string indicating the priming condition the participant was in. “elderly” = elderly prime words, “neutral” = neutral prime words.
time: A number indicating how many seconds it took the participant to walk through the hallway.
donate: An integer indicating whether or not the participant was willing to donate to a charity for elderly homeless people. 1 indicates that they did donate, 0 indicates no donation.
favorite.number: An integer indicating the person’s favorite number between 0 and 100. I don’t know why this question was asked…

Data loading and preparation

A. Open your WPA.RProject (the one you created last week) and open a new script. Save the script with the name WPA5.R.

B. Using read.table(), load the tab-delimited text file containing the data into R from http://nathanieldphillips.com/wp-content/uploads/2016/04/priming-5.txt and assign it to a new object called elderly. Make sure to specify that the file is tab-delimited with the argument sep = "\t" and that it contains a header with the argument header = T.

elderly <- read.table("http://nathanieldphillips.com/wp-content/uploads/2016/04/priming-5.txt",
                      sep = "\t",
                      header = T
                      )

C. Using write.table(), save the data as a text file called elderly.txt into the data folder in your working directory. That way you’ll always have access to the data even if it’s deleted from the website you downloaded it from.

write.table(elderly, 
            file = "data/elderly.txt", 
            sep = "\t")
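
If you want to check that a saved file can be loaded again, something like the following should work. This is only a minimal sketch: the small dataframe demo and the temporary file demo.file are made up for illustration (in the WPA you would use elderly and "data/elderly.txt"), and row.names = FALSE is an optional choice that keeps the row numbers out of the file.

```r
# Hypothetical stand-in for the elderly dataframe
demo <- data.frame(sex = c("f", "m", "m"),
                   age = c(21, 25, 21),
                   time = c(12.9, 10.7, 10.5))

# Temporary file so the example does not touch your data folder
demo.file <- tempfile(fileext = ".txt")

# Save as tab-delimited text without the row numbers
write.table(demo,
            file = demo.file,
            sep = "\t",
            row.names = FALSE)

# Read the file back in
demo.reloaded <- read.table(demo.file,
                            sep = "\t",
                            header = TRUE)

# Compare the reloaded data with the original
all.equal(demo, demo.reloaded)
```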

Understand the data

D. Look at the first few rows of the dataframe with the head() function to make sure it looks ok.

head(elderly)

##   sex age   prime time donate favorite.number
## 1   f  21 elderly 12.9      0              75
## 2   m  25 elderly 10.7      0              80
## 3   m  21 elderly 10.5      1              71
## 4   m  25 elderly  8.6      0              79
## 5   f  22 elderly 11.5      1              66
## 6   f  22 elderly 10.6      1              69

E. Using the summary() function, look at summary statistics for each column in the dataframe. Make sure everything looks ok.

summary(elderly)

##  sex         age            prime         time           donate    
##  f:45   Min.   :17.00   elderly:50   Min.   : 7.50   Min.   :0.00  
##  m:55   1st Qu.:21.00   neutral:50   1st Qu.: 9.50   1st Qu.:0.00  
##         Median :22.00                Median :10.25   Median :0.00  
##         Mean   :21.98                Mean   :10.21   Mean   :0.48  
##         3rd Qu.:23.00                3rd Qu.:10.90   3rd Qu.:1.00  
##         Max.   :28.00                Max.   :12.90   Max.   :1.00  
##  favorite.number
##  Min.   :56.00  
##  1st Qu.:68.00  
##  Median :71.50  
##  Mean   :71.51  
##  3rd Qu.:75.00  
##  Max.   :86.00

Please write your answers to all hypothesis test questions in proper American Pirate Association (APA) style!

Chi-square: X(df) = XXX, p = YYY

t-test: t(df) = XXX, p = YYY

correlation test: r = XXX, t(df) = YYY, p = ZZZZ
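
If you get tired of typing these strings by hand, you could build them from the test object. The apa() function in the yarrr package (used at the end of this WPA) does this for you; the helper below, apa.string(), is a hypothetical hand-rolled sketch that only relies on the statistic, parameter, p.value and estimate elements that t.test(), cor.test() and chisq.test() return. The data are made up.

```r
# Hypothetical helper: format an htest object (from t.test, cor.test
# or chisq.test) as a short APA-style string
apa.string <- function(test) {

  stat.name <- names(test$statistic)             # e.g. "t" or "X-squared"
  stat.val  <- round(unname(test$statistic), 2)
  df        <- round(unname(test$parameter), 2)

  # Report very small p-values as "p < .01"
  p.text <- if (test$p.value < .01) {
    "p < .01"
  } else {
    paste0("p = ", round(test$p.value, 2))
  }

  # cor.test results also carry the correlation estimate r
  r.text <- ""
  if (identical(names(test$estimate), "cor")) {
    r.text <- paste0("r = ", round(unname(test$estimate), 2), ", ")
  }

  paste0(r.text, stat.name, "(", df, ") = ", stat.val, ", ", p.text)
}

# Made-up data, only to show the output format
demo.x <- c(9.5, 10.2, 10.8, 11.1, 9.9)
demo.y <- c(2, 1, 4, 3, 6)

apa.string(t.test(x = demo.x, mu = 10))
apa.string(cor.test(x = demo.x, y = demo.y))
```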

t-test(s)

Was the overall time that people took to walk down the hallway significantly different from 10 seconds? Answer this with a one-sample t-test.

t.test(x = elderly$time,
       mu = 10)

## 
##  One Sample t-test
## 
## data:  elderly$time
## t = 1.9545, df = 99, p-value = 0.05346
## alternative hypothesis: true mean is not equal to 10
## 95 percent confidence interval:
##   9.996886 10.413114
## sample estimates:
## mean of x 
##    10.205

Answer: No, the amount of time people took to walk down the hallway was not significantly different from 10 seconds (t(99) = 1.95, p = .053). That said, the p-value was very close to .05, so many researchers would probably call this marginally significant.
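
To see where the t statistic in a one-sample t-test comes from, you can compute it by hand: it is the difference between the sample mean and mu, divided by the standard error of the mean. The sketch below uses a short made-up vector walk.time (not the real data) and compares the hand calculation with t.test().

```r
# Made-up walking times in seconds (not the real data)
walk.time <- c(10.4, 9.8, 11.2, 10.9, 9.5, 10.6, 10.1, 11.0)
mu <- 10

n <- length(walk.time)

# t = (sample mean - mu) / standard error of the mean
t.by.hand <- (mean(walk.time) - mu) / (sd(walk.time) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p.by.hand <- 2 * pt(-abs(t.by.hand), df = n - 1)

# Compare with the built-in test
t.builtin <- t.test(x = walk.time, mu = mu)

c(by.hand = t.by.hand, builtin = unname(t.builtin$statistic))
c(by.hand = p.by.hand, builtin = t.builtin$p.value)
```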

Did men and women take a significantly different amount of time to walk down the hallway? Answer this with a two-sample t-test.

t.test(formula = time ~ sex,
       data = elderly)

## 
##  Welch Two Sample t-test
## 
## data:  time by sex
## t = 1.0728, df = 87.57, p-value = 0.2863
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1954849  0.6540707
## sample estimates:
## mean in group f mean in group m 
##        10.33111        10.10182

Answer: No, men and women did not have significantly different walking times (t(87.57) = 1.07, p = .29).

Correlation test(s)

Was there a significant relationship between age and walking time? Answer this with a correlation test.

cor.test(formula = ~ time + age,
         data = elderly)

## 
##  Pearson's product-moment correlation
## 
## data:  time and age
## t = -0.72612, df = 98, p-value = 0.4695
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2657526  0.1250622
## sample estimates:
##         cor 
## -0.07315292

Answer: No, there was not a significant relationship between age and walking time (r = -0.07, t(98) = -0.73, p = .47).
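
The t statistic in a correlation test is just a transformation of r and the degrees of freedom. As a quick check, the sketch below plugs the r and df from the output above into the standard formula for a Pearson correlation; the object names are made up for this example.

```r
# Values copied from the cor.test() output above
r  <- -0.07315292
df <- 98

# t statistic for a Pearson correlation
t.from.r <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p.from.r <- 2 * pt(-abs(t.from.r), df = df)

round(t.from.r, 5)   # -0.72612, matching the output above
round(p.from.r, 4)   # 0.4695, matching the output above
```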

Was there a significant relationship between age and a person’s favorite number? Answer this with a correlation test.

cor.test(formula = ~ age + favorite.number,
         data = elderly)

## 
##  Pearson's product-moment correlation
## 
## data:  age and favorite.number
## t = 5.1102, df = 98, p-value = 1.59e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2882498 0.6009704
## sample estimates:
##       cor 
## 0.4586976

Answer: Yes. There was a significant positive correlation between a person’s age and their favorite number (r = 0.46, t(98) = 5.11, p < .01).

chi-square test(s)

Did men and women differ in how likely they were to donate? Answer this with a two-sample chi-square test.

chisq.test(table(elderly$sex, elderly$donate))

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table(elderly$sex, elderly$donate)
## X-squared = 0.0016188, df = 1, p-value = 0.9679

Answer: No, men and women did not differ significantly in how likely they were to donate (X(1) = 0.002, p = 0.97).

Were there significantly different numbers of men versus women? Answer this with a one-sample chi-square test.

chisq.test(table(elderly$sex))

## 
##  Chi-squared test for given probabilities
## 
## data:  table(elderly$sex)
## X-squared = 1, df = 1, p-value = 0.3173

Answer: No, there were not significantly different numbers of men versus women (X(1) = 1.00, p = 0.32).
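
You can reproduce this chi-square statistic by hand from the counts in the summary() output (45 women, 55 men). The sketch below assumes the null hypothesis that both groups are equally likely, which is the default in chisq.test(); the object names are made up for this example.

```r
# Observed counts, taken from the summary() output above
observed <- c(f = 45, m = 55)

# Under the null hypothesis, both groups are equally likely
expected <- sum(observed) * c(0.5, 0.5)

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi.sq <- sum((observed - expected)^2 / expected)

# p-value from the chi-square distribution with 1 degree of freedom
p.val <- pchisq(chi.sq, df = 1, lower.tail = FALSE)

chi.sq            # 1
round(p.val, 4)   # 0.3173
```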

Goal point!!!!

You figure out the test!

Was the average age of the participants significantly different from 100?

t.test(x = elderly$age,
       mu = 100
       )

## 
##  One Sample t-test
## 
## data:  elderly$age
## t = -355.08, df = 99, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 100
## 95 percent confidence interval:
##  21.54402 22.41598
## sample estimates:
## mean of x 
##     21.98

Answer: Yes, the average age of participants was significantly less than 100 (t(99) = -355.08, p < .01).

Were men significantly more or less likely to be assigned to the elderly prime condition compared to women?

chisq.test(table(elderly$sex,
                 elderly$prime))

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table(elderly$sex, elderly$prime)
## X-squared = 0, df = 1, p-value = 1

Answer: No, men and women were both equally likely to be assigned to the elderly prime condition (X(1) = 0, p = 1)

Was there a significant effect of the prime on walking time?

t.test(formula = time ~ prime,
       data = elderly)

## 
##  Welch Two Sample t-test
## 
## data:  time by prime
## t = 4.6412, df = 97.576, p-value = 1.079e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.5071494 1.2648506
## sample estimates:
## mean in group elderly mean in group neutral 
##                10.648                 9.762

Answer: Yes, people given the elderly prime took a significantly longer time to walk down the hallway than those given the neutral prime (t(97.58) = 4.64, p < .01)

Only for participants in the elderly prime condition, was there a relationship between age and walking time?

cor.test(formula = ~ age + time,
         data = elderly,
         subset = prime == "elderly"
         )

## 
##  Pearson's product-moment correlation
## 
## data:  age and time
## t = -1.1908, df = 48, p-value = 0.2396
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4275787  0.1143481
## sample estimates:
##        cor 
## -0.1693911

Answer: No, people in the elderly condition did not show a significant relationship between age and walking time (r = -0.17, t(48) = -1.19, p = 0.24).

Only for men younger than 22, was there a significant effect of the prime on the likelihood of donating?

elderly.2 <- subset(elderly, sex == "m" & age < 22)

chisq.test(table(elderly.2$prime, elderly.2$donate))

## Warning in chisq.test(table(elderly.2$prime, elderly.2$donate)): Chi-
## squared approximation may be incorrect

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table(elderly.2$prime, elderly.2$donate)
## X-squared = 0, df = 1, p-value = 1

Answer: No, men who were younger than 22 did not show a significant effect of prime on the likelihood of donating (X(1) = 0, p = 1). Note that we received a warning from R because the amount of data we tested was so small!
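
When R gives this warning, one common alternative is Fisher's exact test, available as fisher.test() in base R. The following is a minimal sketch using a made-up 2 x 2 table (small.table is not the real data), just to show the idea. It is an optional extra and not part of the original question.

```r
# Made-up 2 x 2 table of counts (not the real data)
small.table <- matrix(c(3, 2,
                        1, 4),
                      nrow = 2,
                      byrow = TRUE,
                      dimnames = list(prime = c("elderly", "neutral"),
                                      donate = c("0", "1")))

# Expected counts under independence. All of them are below 5 here,
# which is what triggers the chisq.test() warning
expected.counts <- suppressWarnings(chisq.test(small.table))$expected
expected.counts

# Fisher's exact test does not rely on the chi-square approximation
fisher.result <- fisher.test(small.table)
fisher.result$p.value
```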

Only for women who made a donation, was there a significant effect of the prime on time?

t.test(formula = time ~ prime,
       data = elderly,
       subset = sex == "f" & donate == 1
       )

## 
##  Welch Two Sample t-test
## 
## data:  time by prime
## t = 3.2929, df = 16.849, p-value = 0.004337
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.3997131 1.8280647
## sample estimates:
## mean in group elderly mean in group neutral 
##             10.891667              9.777778

Answer: Yes, of the women who made a donation, those given an elderly prime took significantly longer to walk down the hallway compared to those given the neutral prime (t(16.85) = 3.29, p < .01).

More fun

In words, interpret the p-value that you got in Question 11 (the test of the prime on donating, for men younger than 22 only).

Answer: Assuming the null hypothesis is correct – that men younger than 22 are equally likely to donate no matter which prime they get – the probability of getting a test statistic at least as extreme as the one we got was 1.00.

Add a new column to the elderly dataframe called y.p that has a correlation of +1 with age. Confirm that the correlation is +1 using cor.test()

# There are infinitely many ways to do this
# Here are a few

elderly$y.p <- elderly$age * 2
elderly$y.p <- elderly$age * 2 - 50
elderly$y.p <- elderly$age / 10
# ...

cor.test(formula = ~ y.p + age, data = elderly)

## 
##  Pearson's product-moment correlation
## 
## data:  y.p and age
## t = Inf, df = 98, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  1 1
## sample estimates:
## cor 
##   1

Add a new column to the elderly dataframe called y.n that has a correlation of -1 with favorite.number. Confirm that the correlation is -1 using cor.test()

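# Again, there are an infinite number of ways to do this
# Note: these examples (and the output below) use age. To answer the
# question as asked, use favorite.number in place of age

elderly$y.n <- -2 * elderly$age
elderly$y.n <- -10 * elderly$age - 22
# ...

cor.test(formula = ~ y.n + age,
         data = elderly)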

## 
##  Pearson's product-moment correlation
## 
## data:  y.n and age
## t = NaN, df = 98, p-value = NA
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  NaN NaN
## sample estimates:
## cor 
##  -1

The rnorm(n, mean, sd) function allows you to generate random data from a normal distribution with a specified mean and standard deviation. You can use the rnorm() function to add random noise to an existing variable. For example, the following code will generate (and plot) two variables, y.1 and y.2, that are correlated with an independent variable x. y.1 is slightly positively correlated with x, and y.2 is highly negatively correlated with x.

# Create x, our independent variable
x <- rnorm(n = 100, mean = 10, sd = 1)

# Create y.1, slightly positively correlated with x

y.1 <- x + rnorm(n = 100, mean = 5, sd = 4)

# Create y.2, highly negatively correlated with x

y.2 <- -1 * x + rnorm(n = 100, mean = 13, sd = .3)

# Plot data

plot(1, 
     xlim = c(5, 15), 
     ylim = c(0, 25),
     ylab = "y",
     xlab = "x",
     type = "n")

points(x, y.1, pch = 21, bg = "red", col = "white")

points(x, y.2, pch = 21, bg = "blue", col = "white")


# Add labels

text(6, 15, paste("cor(x, y.1) = ", round(cor(x, y.1), 2), sep = ""))

text(6, 2.5, paste("cor(x, y.2) = ", round(cor(x, y.2), 2), sep = ""))

# Add legend

legend("topright", 
       legend = c("y.1", "y.2"), 
       pch = 16, 
       col = c("red", "blue"), 
       bty = "n")

Using the rnorm() function, add a new column to the elderly dataframe that has a correlation somewhere between 0.30 and 0.60 with age.
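
Here is one possible approach, as a sketch only. The more noise you add to a variable, the weaker its correlation with the original becomes. If the noise has twice the standard deviation of age, the expected correlation is 1 / sqrt(1 + 2^2), or about 0.45. Any single random draw will vary around that value, so always check the result. The age vector below is a made-up stand-in for elderly$age, and y.new is a hypothetical column name.

```r
# Hypothetical stand-in for elderly$age (the real column is not loaded here)
set.seed(10)  # arbitrary seed so the example is reproducible
age <- round(rnorm(n = 100, mean = 22, sd = 2))

# Add noise with twice the standard deviation of age.
# On average this gives a correlation of about 0.45 with age
y.new <- age + rnorm(n = length(age), mean = 0, sd = 2 * sd(age))

# Check the correlation actually obtained in this sample
cor(age, y.new)
cor.test(x = age, y = y.new)
```

With the real data, the same idea would be elderly$y.new <- elderly$age + rnorm(n = nrow(elderly), mean = 0, sd = 2 * sd(elderly$age)). If the correlation you get falls outside 0.30 to 0.60, run the line again or adjust the noise sd.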

Something for you to do if you got this far…

Create the following plot
- Set up the plotting space and plot the data for one group using plot()
- Add the points for the other group with points()
- Add a legend with legend()
- Add horizontal lines at the group means with abline()
- Add the Mean = XX text with text() combined with the paste() function
- Conduct the appropriate test using t.test() and save the result as a new object.
- Add the APA conclusion with text() combined with the apa() function applied to your test object.

library(yarrr)

# Create elderly and neutral data frames

elderly.elderly <- subset(elderly, prime == "elderly")
elderly.neutral <- subset(elderly, prime == "neutral")

# Create the plot with elderly data
plot(x = elderly.elderly$age,
     y = elderly.elderly$time,
     xlim = c(10, 30),
     ylim = c(5, 15),
     pch = 16,
     col = transparent("red", .8),
     cex = 1,
     xlab = "Age",
     ylab = "Time",
     main = "Age and Walking Times\nElderly vs. Neutral Prime"
     )

# Add mean line for elderly data
abline(h = mean(elderly.elderly$time), 
       col = transparent("red", .8), lwd = 5)

# Add text above mean line
text(12, mean(elderly.elderly$time), 
     labels = paste("Mean = ", round(mean(elderly.elderly$time), 2), sep = ""),
     pos = 3
     )

# Add points for neutral prime data
points(x = elderly.neutral$age,
       y = elderly.neutral$time,
       pch = 16,
       col = transparent("blue", .8),
       cex = 1
       )

# Add mean line
abline(h = mean(elderly.neutral$time), 
       col = transparent("blue", .8), lwd = 5)

# Add text
text(12, mean(elderly.neutral$time), 
     labels = paste("Mean = ", round(mean(elderly.neutral$time), 2), sep = ""),
     pos = 1)

# Run t.test
my.test <- t.test(formula = time ~ prime, data = elderly)

# Add text of apa format of t.test
text(10, 6, labels = apa(my.test), cex = .7, adj = 0)

# Add legend
legend("topleft",
       legend = c("Prime = Elderly", "Prime = Neutral"),
       pch = c(16, 16), 
       col = c(transparent("red", .8), transparent("blue", .8)),
       bty = "n"
       )

WPA #6: 1 and 2 sample Hypothesis Tests

Basel Spring 2016

Automaticity of Social Behavior – Does priming an elderly stereotype change walking speed?
