In this WPA, you will analyze data from a study on attraction. In the study, 1000 heterosexual university students viewed the Facebook profile of another student (the “target”) of the opposite sex. Based on the target’s profile, each participant made three judgments about the target: intelligence, attractiveness, and dateability. The primary judgment was a dateability rating indicating how dateable the person was on a scale of 0 to 100.
The data are located in a tab-delimited text file at http://nathanieldphillips.com/wp-content/uploads/2016/04/facebook.txt
Here is how the first few rows of the data should look:
## session sex age haircolor university education shirtless intelligence
## 1 1 m 23 brown 3.Geneva 3.Masters 2.Yes 1.low
## 2 1 m 19 blonde 2.Zurich 1.HighSchool 1.No 2.medium
## 3 1 f 22 brown 2.Zurich 2.Bachelors 2.Yes 1.low
## 4 1 f 22 red 2.Zurich 2.Bachelors 1.No 2.medium
## 5 1 m 23 brown 3.Geneva 2.Bachelors 1.No 2.medium
## 6 1 m 26 blonde 2.Zurich 3.Masters 2.Yes 3.high
## attractiveness dateability
## 1 3.high 15
## 2 2.medium 44
## 3 2.medium 100
## 4 3.high 100
## 5 2.medium 63
## 6 3.high 76
The data file has 1000 rows and 10 columns. Here are the columns:
session: The experiment session in which the study was run. There were 50 total sessions.
sex: The sex of the target.
age: The age of the target.
haircolor: The hair color of the target.
university: The university that the target attended.
education: The highest level of education obtained by the target.
shirtless: Did the target have a shirtless profile picture? 1.No or 2.Yes
intelligence: How intelligent do you find this target? 1.low, 2.medium, 3.high
attractiveness: How physically attractive do you find this target? 1.low, 2.medium, 3.high
dateability: How dateable is this target? 0 to 100.
A. Open your WPA.RProject and open a new script. Save the script with the name WPA7.R.
B. Using read.table(), load the tab-delimited text file containing the data into R from http://nathanieldphillips.com/wp-content/uploads/2016/04/facebook.txt and assign it to a new object called facebook. Make sure to specify that the file is tab-delimited with the argument sep = "\t" and that it contains a header with the argument header = TRUE.
C. Using write.table(), save the data as a text file called facebook.txt into the data folder in your working directory. That way you’ll always have access to the data even if it’s deleted from the website you downloaded it from.
D. Look at the first few rows of the dataframe with the head() function to make sure it looks ok.
E. Using the summary() function, look at summary statistics for each column in the dataframe. Make sure everything looks ok.
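Steps B through E can be sketched as follows. To stay runnable offline, this sketch does not download the real file: it writes the six rows shown above (with only a subset of the columns) to a temporary tab-delimited file and reads that file back. The object names facebook.demo, demo.path and facebook.check are hypothetical; in the assignment itself you would pass the URL to read.table() and save to the data folder.

```r
# Hypothetical stand-in for the real file: the six rows shown above, subset of columns
facebook.demo <- data.frame(
  session = rep(1, 6),
  sex = c("m", "m", "f", "f", "m", "m"),
  age = c(23, 19, 22, 22, 23, 26),
  haircolor = c("brown", "blonde", "brown", "red", "brown", "blonde"),
  dateability = c(15, 44, 100, 100, 63, 76)
)

# Save as a tab-delimited text file (a temporary file here, not data/facebook.txt)
demo.path <- tempfile(fileext = ".txt")
write.table(facebook.demo, file = demo.path, sep = "\t", row.names = FALSE)

# Read it back; with the real data, replace demo.path with the URL
facebook.check <- read.table(file = demo.path, sep = "\t", header = TRUE)

# Check that the result looks ok
head(facebook.check)
summary(facebook.check)
```

Setting row.names = FALSE keeps write.table() from adding an extra column of row numbers, so the file reads back with the same columns it was written with.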
For each question, conduct the appropriate ANOVA. Write the conclusion in APA style. To summarize an effect in an ANOVA, use the format F(XXX, YYY) = FFF, p = PPP, where XXX is the degrees of freedom of the variable you are testing, YYY is the degrees of freedom of the residuals, FFF is the F value for the variable you are testing, and PPP is the p-value. If the p-value is less than .01, just write p < .01.
If the p-value of the ANOVA is less than .05, conduct post-hoc tests.
For example, here is how I would analyze and answer the question: “Was there an effect of diets on Chicken Weights?”
Answer: There was a significant main effect of diets on chicken weights (F(3, 574) = 10.81, p < .01). Pairwise Tukey HSD tests showed significant differences between diets 1 and 3 (diff = 40.30, p < .01) and diets 1 and 4 (diff = 32.62, p < .01). All other pairwise differences were not significant at the 0.05 significance threshold.
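The degrees of freedom in this example are consistent with R's built-in ChickWeight dataset (578 observations, four diets), but the dataset is not named here, so treat that as an assumption. Under that assumption, a minimal sketch of the analysis behind such an answer might look like this (diet.aov is a hypothetical object name):

```r
# Assumption: the example refers to the built-in ChickWeight data
diet.aov <- aov(formula = weight ~ Diet, data = ChickWeight)

# ANOVA table: degrees of freedom, F value and p-value for Diet
summary(diet.aov)

# Post-hoc pairwise comparisons between diets
TukeyHSD(diet.aov)
```

The Df column of the summary gives XXX (the Diet row) and YYY (the Residuals row), and the diff and p adj columns of the TukeyHSD() output give the pairwise differences and their adjusted p-values.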
Was there a main effect of university on dateability? Conduct a one-way ANOVA. If the result is significant (p < .05), conduct post-hoc tests.
Was there a main effect of intelligence on dateability? Conduct a one-way ANOVA. If the result is significant (p < .05), conduct post-hoc tests.
Was there a main effect of haircolor on dateability? Conduct a one-way ANOVA. If the result is significant (p < .05), conduct post-hoc tests.
Conduct a two-way ANOVA on dateability with both intelligence and university as IVs.
Conduct a multi-way ANOVA including ALL independent variables predicting dateability. Conduct post-hoc tests on the conditions that differ.
Add a new column to the dataframe called all.aov.dateability that has the predicted dateability for each person according to the multi-way ANOVA you just ran
Create a scatterplot showing the relationship between the actual dateability and predicted dateability. Add appropriate labels to the plot
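The last two steps (storing model predictions and plotting them against the observed values) can be sketched on simulated data. Everything here is hypothetical and made up for illustration: the dataframe pred.sim, its columns, and the group means. It is not the facebook data.

```r
# Hypothetical simulated data: one factor and a numeric outcome
set.seed(42)
pred.sim <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 20))
)
pred.sim$outcome <- rnorm(n = 60, mean = rep(c(40, 50, 60), each = 20), sd = 10)

# Fit the ANOVA and store its fitted (predicted) values as a new column
pred.aov <- aov(formula = outcome ~ group, data = pred.sim)
pred.sim$aov.pred <- fitted(pred.aov)

# Scatterplot of actual versus predicted values, with labels
plot(x = pred.sim$outcome,
     y = pred.sim$aov.pred,
     xlab = "Actual outcome",
     ylab = "Predicted outcome (ANOVA)",
     main = "Actual vs. predicted values")

# Dashed reference line where predicted equals actual
abline(a = 0, b = 1, lty = 2)
```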
Create a plot (e.g., pirateplot(), barplot(), boxplot()) showing the distribution of dateability based on two independent variables: sex and shirtless.
Based on what you see in the plot, do you expect there to be an interaction between sex and shirtless? Why or why not?
Test your prediction with the appropriate ANOVA
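As a rough sketch of this plot-then-test workflow, the following uses simulated data rather than the facebook data. The dataframe int.sim and its rating column are hypothetical, and the ratings are random noise, so the output says nothing about the real study.

```r
# Hypothetical simulated data: two factors and a numeric outcome
set.seed(100)
int.sim <- expand.grid(
  sex = c("f", "m"),
  shirtless = c("1.No", "2.Yes"),
  rep = 1:25
)
int.sim$rating <- rnorm(n = nrow(int.sim), mean = 50, sd = 10)

# Distribution of the outcome for each combination of the two factors
boxplot(rating ~ sex + shirtless, data = int.sim,
        xlab = "sex.shirtless", ylab = "rating")

# The * operator includes both main effects and their interaction
int.aov <- aov(formula = rating ~ sex * shirtless, data = int.sim)
summary(int.aov)
```

In the summary table, the row labelled sex:shirtless is the test of the interaction; the rows labelled sex and shirtless are the main effects.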
Conduct a one-way ANOVA on the effect of attractiveness on dateability.
Add the fitted values from your previous ANOVA back to the dataframe as a new vector called attractiveness.aov.dateability
Round attractiveness.aov.dateability to three decimal places using the round() function. (Hint: Assign the variable in the dataframe to a rounded version of itself.)
Look at all the unique values of attractiveness.aov.dateability with table(). How many different values does the ANOVA predict?
Calculate the actual mean dateability for each level of attractiveness with aggregate() or the dplyr package.
Based on what you’ve found, how does an ANOVA fit specific values to data? In other words, if you conduct a one-way ANOVA, and make predictions for groups based on that model, what will the ANOVA predict for each observation in each group?
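The mechanics of these steps can be tried out on a small toy dataset before running them on the real data. The dataframe toy, its columns, and the scores below are all made up for illustration.

```r
# Hypothetical toy data: three groups with made-up scores
toy <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 4)),
  score = c(10, 12, 11, 13, 20, 22, 19, 23, 30, 29, 31, 34)
)

# One-way ANOVA of score on group
toy.aov <- aov(formula = score ~ group, data = toy)

# Add the fitted values to the dataframe, rounded to three decimal places
toy$aov.fit <- round(fitted(toy.aov), 3)

# Distinct values predicted by the ANOVA
table(toy$aov.fit)

# Observed mean score in each group, for comparison
toy.means <- aggregate(score ~ group, data = toy, FUN = mean)
toy.means
```

Comparing the output of table() with the output of aggregate() is the comparison the question above asks you to interpret.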
Let’s study the relationship between university and attractiveness on dateability.
Conduct a one-way ANOVA on dateability with attractiveness as the IV (I know you just did it, but do it again)
Conduct a one-way ANOVA on dateability with university as the IV
Conduct a single multi-way ANOVA with both variables (use formula = dateability ~ attractiveness + university). What is your conclusion?
Did something change? If so, explain your findings!
Conduct a multi-way ANOVA on dateability with sex and education as independent variables. Regardless of whether the effects are significant, conduct post-hoc tests.
Repeat your analysis using regression instead of ANOVA to get regression coefficients.
What are the default values of sex and education in your regression analysis?
What dateability does the regression predict for a female with a high-school education?
What dateability does the regression predict for a male with a PhD?
Are the significance levels for group differences the same in your ANOVA post-hoc tests and your regression analysis?
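A minimal sketch of running the same model as both an ANOVA and a regression, again on simulated data rather than the facebook data. The dataframe reg.sim and its rating column are hypothetical, and the education levels are only the three that appear in the rows shown at the top of this document.

```r
# Hypothetical simulated data: two factors and a random numeric outcome
set.seed(1)
reg.sim <- data.frame(
  sex = factor(rep(c("f", "m"), times = 30)),
  education = factor(rep(c("1.HighSchool", "2.Bachelors", "3.Masters"), each = 20))
)
reg.sim$rating <- rnorm(n = nrow(reg.sim), mean = 50, sd = 10)

# Same formula, two model functions
reg.aov <- aov(formula = rating ~ sex + education, data = reg.sim)
reg.lm <- lm(formula = rating ~ sex + education, data = reg.sim)

# Tukey HSD post-hoc tests from the ANOVA
TukeyHSD(reg.aov)

# Regression coefficients; the intercept is the prediction at the
# first level of each factor
summary(reg.lm)$coefficients

# First (reference) level of each factor
levels(reg.sim$sex)[1]
levels(reg.sim$education)[1]

# Prediction for one hypothetical case
reg.pred <- predict(reg.lm, newdata = data.frame(sex = "m", education = "3.Masters"))
reg.pred
```

Because R treats the first level of each factor as the reference, a prediction for any other combination is the intercept plus the coefficients for the non-reference levels involved.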
Create a plot (e.g., pirateplot(), barplot(), boxplot()) showing the distribution of dateability based on two independent variables: university and haircolor.
Based on what you see in the plot, do you expect there to be an interaction between university and haircolor? Why or why not?
Test your prediction with the appropriate ANOVA
Here are data for 3 of your friends:
| sex | age | university | intelligence | shirtless | attractiveness |
|---|---|---|---|---|---|
| m | 22 | 1.Basel | 3.high | 1.No | 3.high |
| f | 23 | 1.Basel | 1.low | 2.Yes | 3.high |
| m | 26 | 2.Zurich | 1.low | 2.Yes | 1.low |