In this WPA, you will analyze data from a study on attraction. In the study, 1000 heterosexual University students viewed the Facebook profile of another student (the “target”) of the opposite sex. Based on a target’s profile, each participant made three judgments about the target - intelligence, attractiveness, and dateability. The primary judgement was a dateability rating indicating how dateable the person was on a scale of 0 to 100.
The data are located in a tab-delimited text file at http://nathanieldphillips.com/wp-content/uploads/2016/04/facebook.txt
Here is how the first few rows of the data should look:
head(facebook)
## session sex age haircolor university education shirtless intelligence
## 1 1 m 23 brown 3.Geneva 3.Masters 2.Yes 1.low
## 2 1 m 19 blonde 2.Zurich 1.HighSchool 1.No 2.medium
## 3 1 f 22 brown 2.Zurich 2.Bachelors 2.Yes 1.low
## 4 1 f 22 red 2.Zurich 2.Bachelors 1.No 2.medium
## 5 1 m 23 brown 3.Geneva 2.Bachelors 1.No 2.medium
## 6 1 m 26 blonde 2.Zurich 3.Masters 2.Yes 3.high
## attractiveness dateability
## 1 3.high 15
## 2 2.medium 44
## 3 2.medium 100
## 4 3.high 100
## 5 2.medium 63
## 6 3.high 76
The data file has 1000 rows and 10 columns. Here are the columns
session: The experiment session in which the study was run. There were 50 total sessions.
sex: The sex of the target
age: The age of the target
haircolor: The haircolor of the target
university: The university that the target attended.
education: The highest level of education obtained by the target.
shirtless: Did the target have a shirtless profile picture? 1.No v 2.Yes
intelligence: How intelligent do you find this target? 1.Low, 2.Medium, 3.High
attractiveness: How physically attractive do you find this target? 1.Low, 2.Medium, 3.High
dateability: How dateable is this target? 0 to 100.
A. Open your WPA.RProject and open a new script. Save the script with the name WPA7.R.
B. Using read.table(), load the tab-delimited text file containing the data into R from http://nathanieldphillips.com/wp-content/uploads/2016/04/facebook.txt and assign it to a new object called facebook. Make sure to specify that the file is tab-delimited with the argument sep = \t and contains a header with the argument header = T.
C. Using write.table(), save the data as a text file called facebook.txt into the data folder in your working directory. That way you’ll always have access to the data even if it’s deleted from the website you downloaded it from.
facebook <- read.table("http://nathanieldphillips.com/wp-content/uploads/2016/04/facebook.txt",
sep = "\t",
header = T
)
D. Look at the first few rows of the dataframe with the head() function to make sure it looks ok.
head(facebook)
## session sex age haircolor university education shirtless intelligence
## 1 1 m 23 brown 3.Geneva 3.Masters 2.Yes 1.low
## 2 1 m 19 blonde 2.Zurich 1.HighSchool 1.No 2.medium
## 3 1 f 22 brown 2.Zurich 2.Bachelors 2.Yes 1.low
## 4 1 f 22 red 2.Zurich 2.Bachelors 1.No 2.medium
## 5 1 m 23 brown 3.Geneva 2.Bachelors 1.No 2.medium
## 6 1 m 26 blonde 2.Zurich 3.Masters 2.Yes 3.high
## attractiveness dateability
## 1 3.high 15
## 2 2.medium 44
## 3 2.medium 100
## 4 3.high 100
## 5 2.medium 63
## 6 3.high 76
E. Using the summary() function, look at summary statistics for each column in the dataframe. Make sure everything looks ok.
summary(facebook)
## session sex age haircolor university
## Min. : 1.0 f:479 Min. :17.00 blonde:250 1.Basel :310
## 1st Qu.:13.0 m:521 1st Qu.:21.00 brown :500 2.Zurich:363
## Median :25.5 Median :22.00 red :250 3.Geneva:327
## Mean :25.5 Mean :22.33
## 3rd Qu.:38.0 3rd Qu.:23.00
## Max. :50.0 Max. :30.00
## education shirtless intelligence attractiveness
## 1.HighSchool:220 1.No :756 1.low :309 1.low :230
## 2.Bachelors :497 2.Yes:244 2.medium:348 2.medium:317
## 3.Masters :189 3.high :343 3.high :453
## 4.PhD : 94
##
##
## dateability
## Min. : 0.0
## 1st Qu.: 35.0
## Median : 53.0
## Mean : 54.2
## 3rd Qu.: 74.0
## Max. :100.0
For each question, conduct the appropriate ANOVA. Write the conclusion in APA style. To summarize an effect in an ANOVA, use the format F(XXX, YYY) = FFF, p = PPP, where XXX is the degrees of freedom of the variable you are testing, YYY is the degrees of freedom of the residuals, FFF is the F value for the variable you are testing, and PPP is the p-value. If the p-value is less than .01, just write p < .01.
If the p-value of the ANOVA is less than .05, conduct post-hoc tests.
For example, here is how I would analyze and answer the question: “Was there an effect of diets on Chicken Weights?”"
# ANOVA on Chicken Weights
# IV = Diet, DV = weight
# ANOVA
summary(aov(formula = weight ~ Diet,
data = ChickWeight
))
## Df Sum Sq Mean Sq F value Pr(>F)
## Diet 3 155863 51954 10.81 6.43e-07 ***
## Residuals 574 2758693 4806
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# ANOVA was significant, so I'll conduct post-hoc tests
# Tukey post-hoc tests
TukeyHSD(aov(formula = weight ~ Diet,
data = ChickWeight
))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = weight ~ Diet, data = ChickWeight)
##
## $Diet
## diff lwr upr p adj
## 2-1 19.971212 -0.2998092 40.24223 0.0552271
## 3-1 40.304545 20.0335241 60.57557 0.0000025
## 4-1 32.617257 12.2353820 52.99913 0.0002501
## 3-2 20.333333 -2.7268370 43.39350 0.1058474
## 4-2 12.646045 -10.5116315 35.80372 0.4954239
## 4-3 -7.687288 -30.8449649 15.47039 0.8277810
Answer: There was a significant main effect of diets on chicken weights (F(3, 574) = 10.81, p < .01). Pairwise Tukey HSD tests showed significant differences between diets 1 and 3 (diff = 40.30, p < .01) and diets 1 and 4 (diff = 32.62, p < .01). All other pairwise differences were not significant at the 0.05 significance threshold.
uni.aov <- aov(formula = dateability ~ university,
data = facebook
)
summary(uni.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## university 2 19769 9884 13.99 1.01e-06 ***
## Residuals 997 704173 706
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# It is significant
TukeyHSD(uni.aov)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dateability ~ university, data = facebook)
##
## $university
## diff lwr upr p adj
## 2.Zurich-1.Basel -10.275500 -15.099617 -5.451383 0.0000020
## 3.Geneva-1.Basel -8.641689 -13.586611 -3.696767 0.0001311
## 3.Geneva-2.Zurich 1.633811 -3.122189 6.389811 0.6991862
Answer: There was a significant main effect of university on dateability (F(2, 997) = 13.99, p < .01). There was a significant difference between Zurich and Basel (diff = -10, p < .01) and between Geneva and basel (diff = -8.64, p < .01). The difference between Geneva and Zurich was non-significant (diff - 1.63, p = 0.70)
intelligence.aov <- aov(formula = dateability ~ intelligence,
data = facebook)
summary(intelligence.aov) # Significant!
## Df Sum Sq Mean Sq F value Pr(>F)
## intelligence 2 7070 3535 4.917 0.0075 **
## Residuals 997 716871 719
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(intelligence.aov)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dateability ~ intelligence, data = facebook)
##
## $intelligence
## diff lwr upr p adj
## 2.medium-1.low 6.535264 1.615557 11.454970 0.0053173
## 3.high-1.low 2.865833 -2.070710 7.802375 0.3610523
## 3.high-2.medium -3.669431 -8.458242 1.119380 0.1705679
Answer: There was a significant main effect of intelligence on dateability (F(2, 997) = 4.92, p < .01). There was a significant difference between medium and low intelligence (diff = -6.5, p < .01). The difference between high and low intelligence was non-significant (diff = 2.87, p = 0.36) as was the difference between high and medium intelligence (diff = -3.67, p = 0.17)
hair.aov <- aov(formula = dateability ~ haircolor,
data = facebook
)
summary(hair.aov) # Non-significant
## Df Sum Sq Mean Sq F value Pr(>F)
## haircolor 2 1779 889.7 1.228 0.293
## Residuals 997 722162 724.3
Answer: There was a no significant main effect of haircolor on dateability (F(2, 997) = 1.23, p = 0.29).
int.uni.aov <- aov(formula = dateability ~ intelligence + university,
data = facebook
)
summary(int.uni.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## intelligence 2 7070 3535 5.039 0.00665 **
## university 2 18838 9419 13.426 1.76e-06 ***
## Residuals 995 698033 702
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(int.uni.aov)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dateability ~ intelligence + university, data = facebook)
##
## $intelligence
## diff lwr upr p adj
## 2.medium-1.low 6.535264 1.675738 11.394790 0.0046796
## 3.high-1.low 2.865833 -2.010323 7.741988 0.3520433
## 3.high-2.medium -3.669431 -8.399663 1.060800 0.1632717
##
## $university
## diff lwr upr p adj
## 2.Zurich-1.Basel -9.988136 -14.796014 -5.180258 0.0000038
## 3.Geneva-1.Basel -8.490653 -13.418930 -3.562376 0.0001673
## 3.Geneva-2.Zurich 1.497483 -3.242507 6.237473 0.7388000
Do your results for each variable change compared to your previous one-way ANOVAs?
Answer: No, the results are pretty much the same
big.aov <- aov(formula = dateability ~ .,
data = facebook)
summary(big.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## session 1 119 119 0.215 0.64268
## sex 1 36745 36745 66.443 1.09e-15 ***
## age 1 1413 1413 2.556 0.11023
## haircolor 2 1849 924 1.671 0.18852
## university 2 19430 9715 17.567 3.19e-08 ***
## education 3 2052 684 1.237 0.29510
## shirtless 1 108 108 0.195 0.65907
## intelligence 2 5922 2961 5.355 0.00487 **
## attractiveness 2 112121 56061 101.370 < 2e-16 ***
## Residuals 984 544182 553
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Problem! We can't conduct a Tukey HSD test on non-categorical independent variables! (Specifically, session and age).
# I will repeat the ANOVA without these variables, then conduct post-hoc tests:
big2.aov <- aov(formula = dateability ~ sex + haircolor + university + education + shirtless + intelligence + attractiveness,
data = facebook)
summary(big2.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 36514 36514 66.103 1.28e-15 ***
## haircolor 2 1869 935 1.692 0.18466
## university 2 19172 9586 17.354 3.92e-08 ***
## education 3 3186 1062 1.922 0.12424
## shirtless 1 106 106 0.192 0.66106
## intelligence 2 5904 2952 5.345 0.00491 **
## attractiveness 2 112544 56272 101.872 < 2e-16 ***
## Residuals 986 544646 552
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(big2.aov)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dateability ~ sex + haircolor + university + education + shirtless + intelligence + attractiveness, data = facebook)
##
## $sex
## diff lwr upr p adj
## m-f -12.09605 -15.01559 -9.17652 0
##
## $haircolor
## diff lwr upr p adj
## brown-blonde 0.5017632 -3.771443 4.77497 0.9589918
## red-blonde 3.4562264 -1.478048 8.39050 0.2276581
## red-brown 2.9544631 -1.318744 7.22767 0.2364246
##
## $university
## diff lwr upr p adj
## 2.Zurich-1.Basel -9.8015495 -14.067852 -5.535247 0.0000003
## 3.Geneva-1.Basel -9.0414117 -13.414550 -4.668273 0.0000042
## 3.Geneva-2.Zurich 0.7601379 -3.445924 4.966199 0.9055940
##
## $education
## diff lwr upr p adj
## 2.Bachelors-1.HighSchool 3.2829164 -1.6148596 8.180692 0.3111948
## 3.Masters-1.HighSchool 5.4706664 -0.5279113 11.469244 0.0883508
## 4.PhD-1.HighSchool 2.8229708 -4.6298115 10.275753 0.7638807
## 3.Masters-2.Bachelors 2.1877500 -2.9809560 7.356456 0.6962122
## 4.PhD-2.Bachelors -0.4599456 -7.2626327 6.342742 0.9981280
## 4.PhD-3.Masters -2.6476956 -10.2812581 4.985867 0.8087188
##
## $shirtless
## diff lwr upr p adj
## 2.Yes-1.No 0.7529005 -2.642917 4.148718 0.6635954
##
## $intelligence
## diff lwr upr p adj
## 2.medium-1.low 5.917902 1.605769 10.2300343 0.0037664
## 3.high-1.low 2.356941 -1.969949 6.6838298 0.4076096
## 3.high-2.medium -3.560961 -7.758364 0.6364413 0.1148470
##
## $attractiveness
## diff lwr upr p adj
## 2.medium-1.low 12.98680 8.208449 17.76516 0
## 3.high-1.low 24.54165 20.075066 29.00824 0
## 3.high-2.medium 11.55485 7.515186 15.59451 0
Answer: I found significant effects for sex (F(1, 986) = 66.10, p < .01), university (F(2, 986) = 17.35, p < .01), intelligence (F(2, 986) = 5.35, p < .01), and attractiveness (F(2, 986) = 101.87, p < .01). All other variables were non-significant at the .05 threshold. For the university variable, post-hoc tests revealed significant differences between Zurich and Basel (diff = -9.80, p < .01) and Geneva and Basel (diff = -9.04, p < .01). For the intelligence variable, post-hoc tests revealed significant differences between medium and low intelligence (dif = 5.92, p < .01). For the attractiveness variable, post-hoc tests showed significant differences between all group pairs (all ps < .01).
Add a new column to the dataframe called all.aov.dateability that has the predicted dateability for each person according to the multi-way ANOVA you just ran
facebook$all.aov.dateability <- big2.aov$fitted.values
Create a scatterplot showing the relationship between the actual dateability and predicted dateability. Add appropriate labels to the plot
plot(x = facebook$dateability,
y = facebook$all.aov.dateability,
main = "Predicted vs. Actual Dateability",
xlab = "Actual",
ylab = "Model Predictions"
)
big2.b.aov <- aov(formula = dateability ~ sex + haircolor + university + education + shirtless + intelligence + attractiveness,
data = subset(facebook, session >= 31 & session <= 50))
summary(big2.b.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 6933 6933 12.425 0.000474 ***
## haircolor 2 2459 1230 2.203 0.111806
## university 2 12099 6050 10.842 2.62e-05 ***
## education 3 2059 686 1.230 0.298479
## shirtless 1 184 184 0.330 0.565884
## intelligence 2 543 271 0.487 0.615135
## attractiveness 2 36084 18042 32.334 1.04e-13 ***
## Residuals 386 215386 558
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: When I only include sessions 31 through 50, the main effect of intelligence is no longer significant (F(2, 386) = 0.49, p = 0.62). All other variables are qualitatively the same
library(yarrr)
pirateplot(dateability ~ sex + shirtless,
data = facebook)
Based on what you see in the plot, do you expect there to be an interaction between sex and shirtless? Why or why not?
Answer: Yes, the effect of shirtless appears to differ between the two sexes
Test your prediction with the appropriate ANOVA
sex.shirt.int.aov <- aov(formula = dateability ~ sex * shirtless,
data = facebook
)
summary(sex.shirt.int.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 36514 36514 62.689 6.43e-15 ***
## shirtless 1 247 247 0.425 0.515
## sex:shirtless 1 107048 107048 183.786 < 2e-16 ***
## Residuals 996 580132 582
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: My prediction was correct! The interaction is significant (F(1, 996) = 183.79, p < .01)
att.aov <- aov(formula = dateability ~ attractiveness,
data = facebook
)
summary(att.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## attractiveness 2 133708 66854 112.9 <2e-16 ***
## Residuals 997 590234 592
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Add the fitted values from your previous ANOVA back to the dataframe as a new vector called attractiveness.aov.dateability
facebook$attractiveness.aov.dateability <- att.aov$fitted.values
Round attractiveness.aov.dateability to the nearest 3rd decimal place using the round() function. (Hint: Assign the variable in the dataframe to a rounded version of itself).
facebook$attractiveness.aov.dateability <- round(facebook$attractiveness.aov.dateability, 3)
Look at all the unique values of attractiveness.aov.dateability with table(). How many different values does the ANOVA predict?
table(facebook$attractiveness.aov.dateability)
##
## 36.896 50.303 65.711
## 230 317 453
Answer: There are only three different predictions….
Calculate the actual mean dateability for each level of attractiveness with aggregate() or dplyr()
aggregate(dateability ~ attractiveness,
FUN = mean,
data = facebook)
## attractiveness dateability
## 1 1.low 36.89565
## 2 2.medium 50.30284
## 3 3.high 65.71082
Based on what you’ve found, how does an ANOVA fit specific values to data? In other words, if you conduct a one-way ANOVA, and make predictions for groups based on that model, what will the ANOVA predict for each observation in each group?
Answer: ANOVA just predicts the group mean for each group member!
Conduct a one-way ANOVA on dateability with attractiveness as the IV (I know you just did it, but do it again)
att.aov <- aov(formula = dateability ~ attractiveness,
data = facebook
)
summary(att.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## attractiveness 2 133708 66854 112.9 <2e-16 ***
## Residuals 997 590234 592
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: There appears to be a significant effect of attractiveness on dateability (F(2, 997) = 112.9, p < .01)
Conduct a one-way ANOVA on dateability with university as the IV
uni.aov <- aov(formula = dateability ~ university,
data = facebook
)
summary(uni.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## university 2 19769 9884 13.99 1.01e-06 ***
## Residuals 997 704173 706
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: There appears to be a significant effect of university on dateability (F(2, 997) = 13.99, p < .01)
Conduct a single multi-way ANVOA with both variables (use formula = attractiveness + university). What is your conclusion?
att.uni.aov <- aov(formula = dateability ~ attractiveness + university,
data = facebook
)
summary(att.uni.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## attractiveness 2 133708 66854 112.739 <2e-16 ***
## university 2 203 101 0.171 0.843
## Residuals 995 590031 593
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Did something change? If so, explain your findings!
Answer: Yes! The effect of university is not significant when we include attractiveness in the model. The problem is that there is an imbalance between attractiveness and university. Specifically, students from basel are generally more attractive than students from other universities. Therefore, when we only include university in the model, it looks like there is an effect of university. However, the true effect is only on attractiveness!
# Show the relationship between university and attractiveness
with(facebook, table(university,
attractiveness))
## attractiveness
## university 1.low 2.medium 3.high
## 1.Basel 34 29 247
## 2.Zurich 128 114 121
## 3.Geneva 68 174 85
sex.ed.aov <- aov(dateability ~ sex + education,
data = facebook)
summary(sex.ed.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 36514 36514 53.07 6.55e-13 ***
## education 3 2807 936 1.36 0.254
## Residuals 995 684620 688
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(sex.ed.aov)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dateability ~ sex + education, data = facebook)
##
## $sex
## diff lwr upr p adj
## m-f -12.09605 -15.35445 -8.837661 0
##
## $education
## diff lwr upr p adj
## 2.Bachelors-1.HighSchool 2.5729442 -2.893274 8.039163 0.6197997
## 3.Masters-1.HighSchool 5.2384035 -1.456377 11.933184 0.1835943
## 4.PhD-1.HighSchool 2.1344785 -6.183284 10.452241 0.9118887
## 3.Masters-2.Bachelors 2.6654593 -3.103134 8.434052 0.6339136
## 4.PhD-2.Bachelors -0.4384657 -8.030682 7.153750 0.9988299
## 4.PhD-3.Masters -3.1039250 -11.623449 5.415599 0.7846274
Repeat your analysis using regression instead of ANOVA to get regression coefficients.
summary(lm(dateability ~ sex + education,
data = facebook
))
##
## Call:
## lm(formula = dateability ~ sex + education, data = facebook)
##
## Residuals:
## Min 1Q Median 3Q Max
## -60.58 -19.58 -0.36 20.75 54.05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 58.011 1.967 29.495 < 2e-16 ***
## sexm -12.056 1.661 -7.258 7.9e-13 ***
## education2.Bachelors 2.572 2.124 1.211 0.2262
## education3.Masters 5.239 2.602 2.014 0.0443 *
## education4.PhD 2.135 3.232 0.660 0.5091
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 26.23 on 995 degrees of freedom
## Multiple R-squared: 0.05432, Adjusted R-squared: 0.05051
## F-statistic: 14.29 on 4 and 995 DF, p-value: 2.406e-11
What are the default values of sex and education in your regression analysis?
Answer: The default for sex is female, for education it is 1.High School
What dateability does the regression predict for a female with a high-school education?
Answer: 58.011 (the intercept)
What dateability does the regression predict for a male with a PhD?
Answer: 58.011 - 12.056 + 2.135 = 72.202
Are the significance levels for group differences the same in your ANOVA post-hoc tests and your regression analysis?
Answer: Not quite. In the Tukey test, we found no significant post-hoc differences between education groups. In the regression analysis, there is a significant difference between Masters and High School.
library(yarrr)
pirateplot(dateability ~ sex + shirtless,
data = facebook
)
Based on what you see in the plot, do you expect there to be an interaction between sex and shirtless? Why or why not?
Answer: Yes, there appears to be a strong interaction. For non-shirtless people, gender does not appear to make a difference. For shirtless people, women are strongly preferred to men!
Test your prediction with the appropriate ANOVA
summary(aov(dateability ~ sex * shirtless, data = facebook))
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 36514 36514 62.689 6.43e-15 ***
## shirtless 1 247 247 0.425 0.515
## sex:shirtless 1 107048 107048 183.786 < 2e-16 ***
## Residuals 996 580132 582
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: Our prediction is confirmed! There is a significant interaction (F(1, 996) = 183.79, p < 0.01)
pirateplot(dateability ~ university + haircolor,
data = facebook
)
Based on what you see in the plot, do you expect there to be an interaction between university and intelligence? Why or why not?
Answer: No, there does not appear to be an interaction. The effect of school on dateability appears the same in all haircolor conditions
Test your prediction with the appropriate ANOVA
summary(aov(dateability ~ university * haircolor,
data = facebook))
## Df Sum Sq Mean Sq F value Pr(>F)
## university 2 19769 9884 13.967 1.04e-06 ***
## haircolor 2 1647 823 1.163 0.313
## university:haircolor 4 1217 304 0.430 0.787
## Residuals 991 701309 708
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: Confirmed. We do not find a significant interaction between haircolor and university (F(4, 991) = 0.43, p = 0.79)
Here are data for 3 of your friends:
new.data <- data.frame(sex = c("m", "f", "m"),
age = c(22, 23, 26),
university = c("1.Basel", "1.Basel", "2.Zurich"),
intelligence = c("3.high", "1.low", "1.low"),
shirtless = c("1.No", "2.Yes", "2.Yes"),
attractiveness = c("3.high", "3.high", "1.low"),
stringsAsFactors = F
)
library(knitr)
kable(new.data, format = "markdown")
| sex | age | university | intelligence | shirtless | attractiveness |
|---|---|---|---|---|---|
| m | 22 | 1.Basel | 3.high | 1.No | 3.high |
| f | 23 | 1.Basel | 1.low | 2.Yes | 3.high |
| m | 26 | 2.Zurich | 1.low | 2.Yes | 1.low |
new.aov <- aov(formula = dateability ~ sex + age + university + intelligence + shirtless + attractiveness,
data = facebook)
new.data <- data.frame(sex = c("m", "f", "m"),
age = c(22, 23, 26),
university = c("1.Basel", "1.Basel", "2.Zurich"),
intelligence = c("3.high", "1.low", "1.low"),
shirtless = c("1.No", "2.Yes", "2.Yes"),
attractiveness = c("3.high", "3.high", "1.low"),
stringsAsFactors = F
)
predict(new.aov,
new.data)
## 1 2 3
## 58.69826 69.01925 30.39767
Answer: The model predicts that my friends will have dateability ratings of 58.70, 69.02 and 30.40 respectively