Differential Prediction
For this assignment, you will need to either install or load in the following packages:
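The packages used throughout this walkthrough are 'foreign' (SPSS files), 'ggplot2' and 'GGally' (plotting), and 'psych' (reliability and item analyses). A minimal loading sketch; install any you are missing first:
# install.packages(c("foreign", "ggplot2", "GGally", "psych"))  # run once if needed
library(foreign)  # read SPSS (.sav) files
library(ggplot2)  # plotting
library(GGally)   # ggpairs()
library(psych)    # alpha() and other item/scale analyses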
Part 1:
Download the data set “HW2Q1data19.csv” from Canvas. The SPSS file contains the data for three variables: gender (0 = male, 1 = female), an Agreeableness score, and happiness. Conduct a differential prediction analysis to determine whether the Agreeableness measure predicts Happiness differently for men and women. In other words, does the Agreeableness measure show relational bias? Follow the flow chart provided by Mendoza and Lautenschlager (1986) to perform the appropriate sequence of tests. Is there evidence of differential prediction, and if so, does it stem from slope differences, intercept differences, or both? Explain and justify your answer using the SPSS output.
First, let’s read in our data. To make reading in the SPSS file easier, I first converted it to a ‘.csv’ file in SPSS. However, you can still read SPSS files directly into R using the ‘foreign’ package. The code is provided both ways below for your convenience.
library(foreign) #need this for SPSS files
dataset <- read.spss("C:\\PathToFile\\MyDataFile.sav", to.data.frame = TRUE)
Reading in data via a .csv file:
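A minimal sketch of the .csv route, assuming the converted file (with a header row) is in your working directory; the object name 'data' matches the name used in the model calls below:
data <- read.csv("HW2Q1data19.csv", header = TRUE)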
Next, for convenience, let’s label males and females.
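One way to do this, assuming the column is named 'Gender' and coded 0/1 as described in the prompt (Male then becomes the reference level, matching the 'GenderFemale' dummy in the output below):
# Recode the 0/1 gender variable into a labeled factor
data$Gender <- factor(data$Gender, levels = c(0, 1), labels = c("Male", "Female"))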
Now, let’s see if ‘Agreeableness’ shows relational bias based on gender for predicting ‘Happiness.’ We will be using the Mendoza and Lautenschlager (1986) flowchart to help guide our analyses.
Flowchart
Let’s set up our four models:
Model 1 (most basic): Y = B0 + B1X
- B2 = B3 = 0
- Comparing this model to the full model tests the equality of both slopes and intercepts (overall test for bias)
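A sketch of the call that produces the output below (the object name 'model1' is my own):
model1 <- lm(Happiness ~ Agreeableness, data = data)
summary(model1)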
##
## Call:
## lm(formula = Happiness ~ Agreeableness, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.2557 -1.9482 0.1422 2.1696 9.1422
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.32878 0.74533 15.200 < 2e-16 ***
## Agreeableness 0.12043 0.03593 3.352 0.000927 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.945 on 248 degrees of freedom
## Multiple R-squared: 0.04335, Adjusted R-squared: 0.03949
## F-statistic: 11.24 on 1 and 248 DF, p-value: 0.0009269
Model 2 (Full Model): Y = B0 + B1X + B2D + B3DX
- B3DX = interaction term for Agreeableness x Gender
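A sketch of the corresponding call (the object name 'model2' is my own):
model2 <- lm(Happiness ~ Agreeableness + Gender + Agreeableness * Gender, data = data)
summary(model2)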
##
## Call:
## lm(formula = Happiness ~ Agreeableness + Gender + Agreeableness *
## Gender, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.1289 -1.7842 0.0288 1.9948 8.7489
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.90754 1.22068 9.755 <2e-16 ***
## Agreeableness 0.13883 0.05814 2.388 0.0177 *
## GenderFemale -0.75988 1.51563 -0.501 0.6166
## Agreeableness:GenderFemale -0.03867 0.07269 -0.532 0.5952
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.858 on 246 degrees of freedom
## Multiple R-squared: 0.1065, Adjusted R-squared: 0.09558
## F-statistic: 9.771 on 3 and 246 DF, p-value: 4.098e-06
Model 3: Y = B0 + B1X + B3DX
- B2=0
- Test equality of intercepts
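One caution here: in R, the term 'Agreeableness * Gender' automatically expands to include the Gender main effect, which is why the output below still contains a 'GenderFemale' coefficient and is identical to Model 2. To fit Model 3 with B2 truly constrained to 0, the interaction-only operator ':' is needed; a sketch (object name is my own):
model3 <- lm(Happiness ~ Agreeableness + Agreeableness:Gender, data = data)
summary(model3)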
##
## Call:
## lm(formula = Happiness ~ Agreeableness + Agreeableness * Gender,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.1289 -1.7842 0.0288 1.9948 8.7489
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.90754 1.22068 9.755 <2e-16 ***
## Agreeableness 0.13883 0.05814 2.388 0.0177 *
## GenderFemale -0.75988 1.51563 -0.501 0.6166
## Agreeableness:GenderFemale -0.03867 0.07269 -0.532 0.5952
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.858 on 246 degrees of freedom
## Multiple R-squared: 0.1065, Adjusted R-squared: 0.09558
## F-statistic: 9.771 on 3 and 246 DF, p-value: 4.098e-06
Model 4: Y = B0 + B1X + B2D
- B3=0
- Tests equality of slopes
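A sketch of the corresponding call (the object name 'model4' is my own):
model4 <- lm(Happiness ~ Agreeableness + Gender, data = data)
summary(model4)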
##
## Call:
## lm(formula = Happiness ~ Agreeableness + Gender, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.2372 -1.8083 0.0625 1.9641 8.7337
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.41175 0.76809 16.159 < 2e-16 ***
## Agreeableness 0.11409 0.03484 3.274 0.00121 **
## GenderFemale -1.54137 0.37224 -4.141 4.75e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.854 on 247 degrees of freedom
## Multiple R-squared: 0.1054, Adjusted R-squared: 0.0982
## F-statistic: 14.56 on 2 and 247 DF, p-value: 1.056e-06
Step 1: Is there reason to suspect bias? (Model 1 versus Model 2).
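The F test below can be obtained with anova() on the two fitted models; the SSE, Delta R-Squared, and PRE lines appear to come from a separate model-comparison helper (e.g., lmSupport::modelCompare()), which is not shown here. A sketch of the anova() call, using the model objects assumed earlier:
anova(model1, model2)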
## Analysis of Variance Table
##
## Model 1: Happiness ~ Agreeableness
## Model 2: Happiness ~ Agreeableness + Gender + Agreeableness * Gender
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 248 2151.6
## 2 246 2009.7 2 141.98 8.6896 0.0002257 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## SSE (Compact) = 2151.63
## SSE (Augmented) = 2009.653
## Delta R-Squared = 0.06312543
## Partial Eta-Squared (PRE) = 0.06598576
## F(2,246) = 8.68964, p = 0.0002256735
As we can see above, the model with Gender and the interaction term fits significantly better than our baseline model (Model 1). The output also shows the change in R2 and whether the F-statistic is significant. In this case, the change in R2 is .063 and the F-statistic is significant, so we proceed to Step 2.
Step 2: Are the slopes different? (Model 4 vs Model 2).
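Again using the model objects assumed earlier, a sketch of this comparison:
anova(model4, model2)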
## Analysis of Variance Table
##
## Model 1: Happiness ~ Agreeableness + Gender
## Model 2: Happiness ~ Agreeableness + Gender + Agreeableness * Gender
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 247 2012.0
## 2 246 2009.7 1 2.3117 0.283 0.5952
## SSE (Compact) = 2011.965
## SSE (Augmented) = 2009.653
## Delta R-Squared = 0.001027842
## Partial Eta-Squared (PRE) = 0.001148998
## F(1,246) = 0.2829787, p = 0.5952355
The results above suggest that adding the interaction term (Model 4 vs. Model 2) does not significantly improve fit; in other words, the slopes for men and women do not differ significantly.
Step 3: Since the difference in the prediction of Happiness by gender is not due to differences in slopes, we next test Model 4 vs. Model 1 to determine whether the intercepts (means) are different.
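A sketch of this comparison, with the same assumed model objects:
anova(model1, model4)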
## Analysis of Variance Table
##
## Model 1: Happiness ~ Agreeableness
## Model 2: Happiness ~ Agreeableness + Gender
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 248 2151.6
## 2 247 2012.0 1 139.66 17.146 4.754e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## SSE (Compact) = 2151.63
## SSE (Augmented) = 2011.965
## Delta R-Squared = 0.06209759
## Partial Eta-Squared (PRE) = 0.06491134
## F(1,247) = 17.14608, p = 4.753903e-05
As we can see above, the change in R2 is .062 and the F-statistic is significant. These results indicate that we did detect a difference in intercepts for males and females.
Let’s go ahead and visualize our data to further examine the intercept differences for males and females. As a reminder, we will be using ‘ggplot2’ and ‘GGally’ to do so. The ‘ggpairs’ command makes it easy to visualize different aspects of our two variables of interest by gender; in the graph below, blue represents females and red represents males.
ggpairs(data, c(1:3), mapping = aes(color = Gender),
        switch = NULL, title = "Happiness & Agreeableness by Gender")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Overall interpretation:
Based on the various tests conducted using the flowchart, our results suggest that there is bias in how Agreeableness predicts Happiness for males versus females, and this bias is reflected in intercept (mean-level) differences rather than slope differences. In the plot above, this discrepancy is particularly visible in the Happiness variable.
Part 2:
The file “HW2_Q2.dat” contains dichotomous (disagree-agree) item response data for 50 items intended to measure anxiety. Import these data into SPSS and then perform the following activities:
- a.) Perform appropriate item and scale analyses and use those results to select 10 items to form a 10-item measure. Briefly describe the criteria you set for inclusion.
- b.) Pick 10 best and 10 worst items from the scale. Give reasons for your answer. Calculate alpha for the new 10-item measure (based on the best items you picked).
Again, let’s go ahead and import our data:
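A minimal import sketch, assuming ‘HW2_Q2.dat’ is whitespace-delimited with no header row (R then auto-names the columns V1, V2, ..., which matches the V2–V51 item labels in the output below, with V1 being the ID column); the object name ‘raw.data’ is my own:
raw.data <- read.table("HW2_Q2.dat", header = FALSE)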
To make it easier, drop ‘ID’ from our dataset - we do not want ID to impact our alpha estimates, as it is irrelevant for this exercise.
- Once we drop the ID variable from our dataset, we can easily calculate alpha and other relevant statistics by using the ‘alpha’ command in the psych package.
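A sketch of those two steps, assuming ID is the first column of ‘raw.data’ (the object name ‘new.data’ matches the call shown in the output below):
new.data <- raw.data[, -1]  # drop the ID column so it does not enter the reliability analysis
alpha(new.data)             # from the psych package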
##
## Reliability analysis
## Call: alpha(x = new.data)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.85 0.85 0.86 0.1 5.8 0.0071 0.56 0.16 0.098
##
## lower alpha upper 95% confidence boundaries
## 0.84 0.85 0.86
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## V2 0.85 0.85 0.86 0.10 5.7 0.0073 0.0039 0.098
## V3 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.099
## V4 0.84 0.85 0.86 0.10 5.5 0.0074 0.0039 0.095
## V5 0.85 0.85 0.86 0.10 5.6 0.0074 0.0040 0.097
## V6 0.84 0.85 0.86 0.10 5.5 0.0074 0.0039 0.096
## V7 0.84 0.85 0.86 0.10 5.6 0.0074 0.0040 0.096
## V8 0.85 0.85 0.86 0.10 5.7 0.0073 0.0042 0.098
## V9 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.098
## V10 0.85 0.85 0.86 0.11 5.8 0.0072 0.0042 0.099
## V11 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.097
## V12 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.100
## V13 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.100
## V14 0.85 0.85 0.86 0.10 5.7 0.0073 0.0042 0.098
## V15 0.84 0.85 0.86 0.10 5.5 0.0075 0.0038 0.095
## V16 0.85 0.85 0.86 0.11 5.8 0.0072 0.0042 0.100
## V17 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.100
## V18 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.097
## V19 0.85 0.85 0.86 0.11 5.9 0.0071 0.0041 0.101
## V20 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.101
## V21 0.84 0.85 0.86 0.10 5.6 0.0074 0.0040 0.096
## V22 0.85 0.85 0.86 0.11 5.8 0.0072 0.0042 0.100
## V23 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.097
## V24 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.100
## V25 0.85 0.85 0.86 0.11 5.8 0.0072 0.0042 0.099
## V26 0.85 0.85 0.86 0.10 5.6 0.0074 0.0041 0.096
## V27 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.099
## V28 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.099
## V29 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.101
## V30 0.85 0.85 0.86 0.10 5.6 0.0073 0.0042 0.096
## V31 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.099
## V32 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.097
## V33 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.097
## V34 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.100
## V35 0.85 0.85 0.86 0.11 5.9 0.0071 0.0041 0.101
## V36 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.097
## V37 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.097
## V38 0.85 0.85 0.86 0.11 5.9 0.0070 0.0040 0.102
## V39 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.098
## V40 0.85 0.85 0.86 0.10 5.6 0.0073 0.0041 0.098
## V41 0.85 0.85 0.86 0.11 5.8 0.0072 0.0042 0.100
## V42 0.84 0.85 0.86 0.10 5.6 0.0074 0.0040 0.095
## V43 0.85 0.85 0.86 0.11 5.8 0.0071 0.0041 0.101
## V44 0.85 0.85 0.86 0.10 5.7 0.0073 0.0042 0.098
## V45 0.85 0.85 0.86 0.11 5.8 0.0071 0.0042 0.101
## V46 0.85 0.85 0.86 0.10 5.7 0.0073 0.0042 0.098
## V47 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.098
## V48 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.098
## V49 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.099
## V50 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.099
## V51 0.85 0.85 0.86 0.10 5.7 0.0072 0.0042 0.099
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## V2 99 0.35 0.38 0.38 0.337 0.17 0.38
## V3 885 0.30 0.32 0.29 0.262 0.82 0.38
## V4 885 0.52 0.52 0.52 0.476 0.66 0.47
## V5 885 0.45 0.46 0.45 0.407 0.49 0.50
## V6 885 0.53 0.53 0.53 0.488 0.59 0.49
## V7 885 0.48 0.48 0.47 0.429 0.69 0.46
## V8 885 0.39 0.39 0.37 0.342 0.35 0.48
## V9 885 0.36 0.36 0.34 0.310 0.33 0.47
## V10 885 0.29 0.28 0.25 0.228 0.39 0.49
## V11 885 0.45 0.44 0.43 0.397 0.41 0.49
## V12 885 0.27 0.27 0.23 0.211 0.40 0.49
## V13 885 0.25 0.25 0.21 0.193 0.58 0.49
## V14 885 0.38 0.38 0.36 0.331 0.54 0.50
## V15 885 0.58 0.58 0.58 0.536 0.51 0.50
## V16 885 0.27 0.27 0.24 0.216 0.28 0.45
## V17 885 0.23 0.22 0.19 0.168 0.40 0.49
## V18 885 0.45 0.45 0.43 0.402 0.42 0.49
## V19 885 0.18 0.18 0.13 0.118 0.58 0.49
## V20 885 0.23 0.23 0.19 0.175 0.67 0.47
## V21 885 0.51 0.51 0.50 0.460 0.74 0.44
## V22 885 0.28 0.27 0.24 0.218 0.63 0.48
## V23 885 0.44 0.44 0.43 0.396 0.61 0.49
## V24 885 0.26 0.26 0.23 0.207 0.52 0.50
## V25 885 0.30 0.29 0.26 0.235 0.46 0.50
## V26 885 0.46 0.46 0.45 0.412 0.60 0.49
## V27 885 0.32 0.33 0.30 0.274 0.71 0.46
## V28 885 0.35 0.35 0.32 0.294 0.74 0.44
## V29 885 0.21 0.21 0.17 0.150 0.52 0.50
## V30 885 0.43 0.44 0.42 0.388 0.82 0.39
## V31 885 0.30 0.31 0.28 0.256 0.22 0.42
## V32 885 0.41 0.42 0.41 0.371 0.80 0.40
## V33 885 0.42 0.43 0.42 0.384 0.83 0.38
## V34 885 0.24 0.24 0.20 0.184 0.50 0.50
## V35 885 0.19 0.18 0.14 0.124 0.55 0.50
## V36 885 0.46 0.46 0.45 0.411 0.73 0.45
## V37 885 0.44 0.44 0.42 0.388 0.51 0.50
## V38 885 0.16 0.15 0.10 0.091 0.57 0.50
## V39 885 0.34 0.34 0.31 0.285 0.26 0.44
## V40 885 0.40 0.41 0.40 0.362 0.87 0.34
## V41 885 0.27 0.27 0.23 0.212 0.62 0.49
## V42 885 0.51 0.51 0.51 0.469 0.55 0.50
## V43 885 0.20 0.20 0.16 0.142 0.59 0.49
## V44 885 0.38 0.38 0.35 0.324 0.75 0.44
## V45 885 0.23 0.22 0.18 0.163 0.57 0.49
## V46 885 0.38 0.38 0.36 0.335 0.18 0.38
## V47 885 0.32 0.33 0.31 0.280 0.21 0.41
## V48 885 0.33 0.33 0.30 0.273 0.59 0.49
## V49 885 0.31 0.32 0.29 0.265 0.80 0.40
## V50 885 0.32 0.31 0.28 0.259 0.69 0.46
## V51 885 0.31 0.31 0.28 0.257 0.64 0.48
##
## Non missing response frequency for each item
## 0 1 miss
## V2 0.83 0.17 0.89
## V3 0.18 0.82 0.00
## V4 0.34 0.66 0.00
## V5 0.51 0.49 0.00
## V6 0.41 0.59 0.00
## V7 0.31 0.69 0.00
## V8 0.65 0.35 0.00
## V9 0.67 0.33 0.00
## V10 0.61 0.39 0.00
## V11 0.59 0.41 0.00
## V12 0.60 0.40 0.00
## V13 0.42 0.58 0.00
## V14 0.46 0.54 0.00
## V15 0.49 0.51 0.00
## V16 0.72 0.28 0.00
## V17 0.60 0.40 0.00
## V18 0.58 0.42 0.00
## V19 0.42 0.58 0.00
## V20 0.33 0.67 0.00
## V21 0.26 0.74 0.00
## V22 0.37 0.63 0.00
## V23 0.39 0.61 0.00
## V24 0.48 0.52 0.00
## V25 0.54 0.46 0.00
## V26 0.40 0.60 0.00
## V27 0.29 0.71 0.00
## V28 0.26 0.74 0.00
## V29 0.48 0.52 0.00
## V30 0.18 0.82 0.00
## V31 0.78 0.22 0.00
## V32 0.20 0.80 0.00
## V33 0.17 0.83 0.00
## V34 0.50 0.50 0.00
## V35 0.45 0.55 0.00
## V36 0.27 0.73 0.00
## V37 0.49 0.51 0.00
## V38 0.43 0.57 0.00
## V39 0.74 0.26 0.00
## V40 0.13 0.87 0.00
## V41 0.38 0.62 0.00
## V42 0.45 0.55 0.00
## V43 0.41 0.59 0.00
## V44 0.25 0.75 0.00
## V45 0.43 0.57 0.00
## V46 0.82 0.18 0.00
## V47 0.79 0.21 0.00
## V48 0.41 0.59 0.00
## V49 0.20 0.80 0.00
## V50 0.31 0.69 0.00
## V51 0.36 0.64 0.00
- In this output, ‘r.drop’ = the item-whole correlation for each item against the scale with that item removed.
- Similarly, ‘r.cor’ = the item-whole correlation corrected for item overlap and scale reliability; this is our item discrimination parameter.
- ‘raw.r’ = the correlation of each item with the total score, not corrected for item overlap.
- Consider dropping or revising items with r.cor lower than .30
Which items should we pick?
We want to balance easy and difficult items, but we also don’t want items with item-total correlations that are too low or with unfavorable alpha-if-dropped values. An easy way to organize all this information is to pull the output from the ‘alpha’ command into a dataframe that we can sort in ascending or descending order. To do so, we first extract the information we need from ‘alpha.’ Let’s start by saving our alpha output as an object in our environment.
The nice thing about many of these R commands is that they allow you to extract specific information from the output. When we use ‘$’ with our saved alpha object, we can select which part of the output we would like to keep (see the picture below for the options). In this example, ‘item.stats’ provides the most beneficial information. We can then save it as a new object containing just the info we are interested in; the command below converts it to a dataframe, allowing the output to be easily organized and sorted (again, you can name these objects whatever you like).
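A sketch of these two steps (the object names ‘alpha.output’ and ‘item.stats’ are my own):
alpha.output <- alpha(new.data)                         # save the full alpha output as an object
item.stats   <- as.data.frame(alpha.output$item.stats)  # keep just the item-level statistics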
View this information in a dataframe. This allows us to easily sort our data and view it in a separate window.
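For example:
View(item.stats)  # opens a sortable, spreadsheet-style viewer in a separate window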
We can now sort and look at different parameters of interest to form a new 10-item measure. While there are multiple defensible approaches, I prioritized 5 items that were ‘easier’ (i.e., items with a higher mean/endorsement frequency). Among the easier items, I avoided items that were too easy (i.e., items with a mean of .80 or higher were not selected) and looked for items with decent item discrimination (i.e., higher ‘r.cor’ values). I then selected 5 more ‘difficult’ items using the same criteria. The 10 items I selected are presented below:
| Easier Items | Harder Items |
|---|---|
| Item #7: r.cor = .47; mean = .69 | Item #8: r.cor = .37; mean = .35 |
| Item #4: r.cor = .52; mean = .66 | Item #9: r.cor = .34; mean = .33 |
| Item #23: r.cor = .43; mean = .61 | Item #18: r.cor = .43; mean = .42 |
| Item #26: r.cor = .45; mean = .60 | Item #11: r.cor = .43; mean = .41 |
| Item #21: r.cor = .50; mean = .74 | Item #39: r.cor = .31; mean = .26 |
For part ‘b’ of this question - now let’s also select the ‘10 worst’ and ‘10 best’ items, subset these from our original dataset, and compute a new alpha value for each new set of items.
For the 10 worst items, I selected items: 38, 19, 35, 43, 29, 45, 17, 20, 34, & 13.
- All of these items had corrected item-total correlations (r.cor in the output) below 0.22; for good items, we ideally want values of at least .30.
- Although a reasonable proportion of respondents endorsed these items, they are still not ideal because of their poor item discrimination.
For the 10 best items, I selected items: 15, 6, 4, 42, 21, 7, 36, 5, 26, & 18.
- These items had high item discrimination (high r.cor values) and endorsement proportions of roughly .4 to .7, so they were neither too easy nor too difficult.
Now let’s go ahead and subset our original dataset so we can easily compute alpha for our 10 best & 10 worst items.
worst.items<- new.data[,c("V38","V19", "V35", "V43", "V29","V45","V17","V20","V34","V13")]
best.items<- new.data[,c("V15","V6", "V4", "V42", "V21","V7","V36","V5","V26","V18")]
Calculating new alpha values for worst items:
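A sketch of the call (full output omitted):
alpha(worst.items)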
- Our raw/std. alpha for worst items is 0.21
Calculating new alpha values for best items:
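And similarly:
alpha(best.items)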
- Our raw/std. alpha for best items is 0.76