1. Regress X2 Mathematics standardized theta score (X2TXMTSCOR) on X1 Socio-economic status composite (X1SES) and X1 Scale of student’s mathematics self-efficacy (X1MTHEFF) (5).
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 51.510461 | 0.2998145 | 171.807789 | 0 |
| X1SES | 5.471501 | 0.3971550 | 13.776739 | 0 |
| X1MTHEFF | 2.428152 | 0.2939791 | 8.259608 | 0 |
2. Interpret the output generated from question 1 including R squared and any statistically significant independent variables (5).
A regression analysis was run to study the relationship between the dependent variable, X2 Mathematics standardized theta score (X2TXMTSCOR), and the independent variables X1 Socio-economic status composite (X1SES) and X1 Scale of student’s mathematics self-efficacy (X1MTHEFF). Both independent variables were statistically significant predictors. As X1SES increases by 1 unit, we expect the X2 math score to increase by 5.47 points, holding X1MTHEFF constant (p < 0.001). As X1MTHEFF increases by 1 unit, we expect the X2 math score to increase by 2.43 points, holding X1SES constant (p < 0.001). The constant, 51.51, is the X2 math score we would expect for someone whose X1SES and X1MTHEFF are both 0. The R squared of 0.259 means that the two predictors together explain about 26% of the variance in X2 math scores.
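Written out as an equation, the fitted model restates the coefficient estimates from question 1, rounded to two decimals. The worked value uses hypothetical inputs (X1SES = 1 and X1MTHEFF = 1) chosen only for illustration.

$$
\widehat{\text{X2TXMTSCOR}} = 51.51 + 5.47\,\text{X1SES} + 2.43\,\text{X1MTHEFF}
$$

$$
51.51 + 5.47(1) + 2.43(1) = 59.41
$$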
| | Model 1 |
|---|---|
| (Intercept) | 51.510 |
| | (0.300) |
| X1SES | 5.472 |
| | (0.397) |
| X1MTHEFF | 2.428 |
| | (0.294) |
| Num.Obs. | 884 |
| R2 | 0.259 |
| R2 Adj. | 0.257 |
| AIC | 6362.6 |
| BIC | 6381.8 |
| Log.Lik. | -3177.312 |
| F | 153.890 |
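As a sanity check on the two regression tables, the sketch below recomputes the t statistics (estimate divided by standard error) and recovers the overall F statistic from R squared. It is a minimal illustration in Python using only the numbers reported above; the variable names are my own, and the R squared value is the rounded 0.259, so the recovered F is approximate.

```python
# Coefficient estimates and standard errors copied from the question 1 output.
coefs = {
    "(Intercept)": (51.510461, 0.2998145),
    "X1SES": (5.471501, 0.3971550),
    "X1MTHEFF": (2.428152, 0.2939791),
}

# Each t statistic is the estimate divided by its standard error.
t_stats = {term: est / se for term, (est, se) in coefs.items()}

# Overall F statistic recovered from R squared:
# F = (R2 / k) / ((1 - R2) / (n - k - 1)), with k predictors and n observations.
r2, n, k = 0.259, 884, 2  # r2 is the rounded value from the Model 1 table
f_from_r2 = (r2 / k) / ((1 - r2) / (n - k - 1))

for term, t in t_stats.items():
    print(f"{term}: t = {t:.2f}")
print(f"F({k}, {n - k - 1}) is approximately {f_from_r2:.1f}")
```

The recovered F comes out at roughly 154, close to the reported 153.890; the small gap comes from using R squared rounded to three decimals.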
3. Generate ANOVA output with X2 Mathematics standardized theta score (X2TXMTSCOR) as the dependent variable and X1 School locale (urbanicity) (X1LOCALE) as the factor variable. (5)
| term | df | sumsq | meansq | statistic | p.value |
|---|---|---|---|---|---|
| X1LOCALE_f | 3 | 1212.104 | 404.0348 | 3.81275 | 0.0098445 |
| Residuals | 996 | 105545.518 | 105.9694 | NA | NA |
Since the ANOVA p-value is statistically significant, we can run a post-hoc analysis to determine which group or groups differ from the others. My hypothesis is that City and Town will differ from one another, although I cannot say whether there is a statistically significant difference between the other groups. Let’s run a Tukey HSD post-hoc test to see.
It turns out that City is statistically significantly different from both Town (p < 0.05) and Rural (p < 0.05).
| term | contrast | null.value | estimate | conf.low | conf.high | adj.p.value |
|---|---|---|---|---|---|---|
| X1LOCALE_f | Suburb-City | 0 | -0.5972363 | -2.738983 | 1.5445103 | 0.8901180 |
| X1LOCALE_f | Town-City | 0 | -2.9765954 | -5.933855 | -0.0193363 | 0.0478197 |
| X1LOCALE_f | Rural-City | 0 | -2.3044177 | -4.553793 | -0.0550426 | 0.0422481 |
| X1LOCALE_f | Town-Suburb | 0 | -2.3793591 | -5.272982 | 0.5142639 | 0.1486581 |
| X1LOCALE_f | Rural-Suburb | 0 | -1.7071814 | -3.872213 | 0.4578499 | 0.1779278 |
| X1LOCALE_f | Rural-Town | 0 | 0.6721777 | -2.301988 | 3.6463438 | 0.9376256 |
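The Tukey table can also be read through its confidence intervals: a contrast is significant when its interval excludes zero. The sketch below applies that rule to the reported rows. It assumes the intervals are 95% family-wise intervals (the table does not state the level), and the helper name `excludes_zero` is hypothetical.

```python
# (contrast, estimate, conf.low, conf.high, adj.p.value) from the Tukey HSD table.
tukey = [
    ("Suburb-City", -0.5972363, -2.738983, 1.5445103, 0.8901180),
    ("Town-City", -2.9765954, -5.933855, -0.0193363, 0.0478197),
    ("Rural-City", -2.3044177, -4.553793, -0.0550426, 0.0422481),
    ("Town-Suburb", -2.3793591, -5.272982, 0.5142639, 0.1486581),
    ("Rural-Suburb", -1.7071814, -3.872213, 0.4578499, 0.1779278),
    ("Rural-Town", 0.6721777, -2.301988, 3.6463438, 0.9376256),
]


def excludes_zero(conf_low, conf_high):
    """True when the interval lies entirely on one side of zero."""
    return conf_low > 0 or conf_high < 0


# ASSUMPTION: the intervals are 95% intervals, so this rule should agree
# with comparing the adjusted p-value against 0.05.
by_interval = [name for name, _, lo, hi, _ in tukey if excludes_zero(lo, hi)]
by_p_value = [name for name, _, _, _, p in tukey if p < 0.05]

print(by_interval)
print(by_interval == by_p_value)
```

Both rules pick out the same two contrasts, Town-City and Rural-City. Note that both upper bounds sit very close to zero, so these differences are only marginally significant.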
4. Generate ANCOVA output with X2 Mathematics standardized theta score (X2TXMTSCOR) as the dependent variable, X1 School locale (urbanicity) (X1LOCALE) as the factor variable, and include X1 Mathematics standardized theta score (X1TXMTSCOR) as a Covariate. (5)
| Effect | DFn | DFd | F | p | p<.05 | ges |
|---|---|---|---|---|---|---|
| X1TXMTSCOR | 1 | 995 | 1250.553 | 0.000 | * | 0.557 |
| X1LOCALE_f | 3 | 995 | 1.917 | 0.125 | | 0.006 |
5. What function does the inclusion of X1 Mathematics standardized theta score (X1TXMTSCOR) in the ANCOVA analysis serve? (5)
In question 3, the model had locale as the sole independent variable, and the effect of locale was statistically significant. In question 4, we tested whether the effect of locale remains statistically significant after adjusting for X1 Mathematics standardized theta score. Including X1 math scores as a covariate controls for prior achievement: it accounts for the variation in X2 scores that is explained by X1 scores, which reduces confounding (students in different locales may already differ at baseline) and gives a better estimate of the effect of locale on X2 test scores.
6. Interpret the ANOVA output from question 3. (5)
An ANOVA was run to determine whether locale was a statistically significant predictor of students’ X2 Mathematics standardized theta score. The ANOVA found a statistically significant difference between groups, F(3, 996) = 3.81, p < 0.01. Therefore we reject the null hypothesis that all group means are equal.
As a follow up, a Tukey HSD post-hoc analysis determined there was a statistically significant difference in math scores between students in the City group vs Town group (p < 0.05) as well as the City group vs the Rural group (p < 0.05).
The mean difference in X2 Mathematics theta scores between the city group and town group is 2.98. This means that students in the city would be expected to have, on average, an X2 Mathematics theta score that is 2.98 units higher than students who live in town.
The mean difference in X2 Mathematics theta scores between city students and rural students is 2.30. This means that the students who live in the city can be expected to have an X2 Mathematics theta score that is 2.30 units higher than rural students, on average.
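The F statistic in the question 3 table is the ratio of the between-group mean square to the residual mean square, where each mean square is a sum of squares divided by its degrees of freedom. The short Python sketch below recomputes it from the reported sums of squares; the variable names are my own.

```python
# Sums of squares and degrees of freedom from the question 3 ANOVA table.
ss_between, df_between = 1212.104, 3       # X1LOCALE_f row
ss_within, df_within = 105545.518, 996     # Residuals row

# A mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F statistic is the ratio of the two mean squares.
f_stat = ms_between / ms_within

print(f"MS between = {ms_between:.4f}")
print(f"MS within  = {ms_within:.4f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The value 404.03 in the table is the between-group mean square, while the test statistic itself is about 3.81.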
7. Interpret the ANCOVA output from question 4. (5)
An ANCOVA was run to determine the difference in mean X2 mathematics standardized theta score between students in different locales after adjusting for X1 mathematics standardized theta score. After adjustment, there was not a statistically significant difference in X2 Mathematics standardized theta score among students by locale F(3, 995) = 1.917, p = 0.125. Since the p value is not statistically significant, we do not run a post-hoc analysis.
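The effect sizes in the ANCOVA table can be cross-checked from the F statistics alone. The sketch below uses the standard identity for partial eta squared, F·df1 / (F·df1 + df2). It assumes that the `ges` column (generalized eta squared) coincides with partial eta squared in this design; that is an assumption on my part, though the recomputed values match the table. The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom from the question 4 ANCOVA table.
# ASSUMPTION: the reported `ges` equals partial eta squared for this model.
effects = {
    "X1TXMTSCOR": (1250.553, 1, 995),
    "X1LOCALE_f": (1.917, 3, 995),
}

for name, (f_value, df1, df2) in effects.items():
    print(f"{name}: {partial_eta_squared(f_value, df1, df2):.3f}")
```

Under that assumption, prior math score accounts for a little over half of the variance not explained by locale, while locale accounts for well under 1% once prior score is in the model.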
8. How do you explain differences in the p-values or statistical significance across the analyses conducted in questions 3 & 4, if any? (5)
In the ANOVA, the effect of locale was statistically significant (p = 0.0098) because we were not controlling for any covariates. Adding the covariate X1 math score removed the statistical significance (p = 0.125) because X1 math score explains much of the variation in X2 math score, including the variation that differed across locales. Furthermore, we cannot draw conclusions about causality, because X1 math score and locale may be related to each other and it is difficult to say whether one causes the other or whether they are simply correlated.
9. Generate a frequency table for the following variable: X1 School locale (urbanicity) (X1LOCALE). Also, generate regression output which uses X2 Mathematics standardized theta score (X2TXMTSCOR) as the dependent variable and X1 School locale (urbanicity) (X1LOCALE) as an independent variable. (5)
| group | n | prop (%) |
|---|---|---|
| City | 283 | 28.3 |
| Suburb | 333 | 33.3 |
| Town | 112 | 11.2 |
| Rural | 272 | 27.2 |
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 53.8080449 | 0.741652 | 72.551611 | 0.0000000 |
| X1LOCALE | -0.8591989 | 0.280820 | -3.059607 | 0.0022755 |
10. Using the output from question 9, explain why the approach used has generated output that is not appropriate to be interpreted. Cite at least two specific problems with the data and/or the analysis conducted. (5)
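The X1LOCALE variable is coded as a continuous variable in SPSS, but it is actually a factor with 4 different levels (city, suburb, town and rural). First, running the model in question 9 assumes that the relationship between X2 math score and locale is linear, with locale treated as a continuous variable. In actuality, locale is a categorical variable and each locale has its own mean X2 math score. Second, the model treats the numeric codes as equally spaced points on an ordered scale, so the slope of -0.86 "per unit of locale" has no meaningful interpretation: the categories have no natural order, and the differences between them need not be equal. The appropriate approach is to enter locale as a factor (a set of dummy variables with one reference group), as in the ANOVA in question 3.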
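To make the problem concrete, the sketch below compares the group differences implied by the linear model in question 9 with the differences estimated by the Tukey test in question 3. It assumes the numeric coding is 1 = City, 2 = Suburb, 3 = Town, 4 = Rural, which matches the order of the frequency table but is not stated in the output; the variable names are my own.

```python
# ASSUMPTION: numeric coding of X1LOCALE (not shown in the output above).
codes = {"City": 1, "Suburb": 2, "Town": 3, "Rural": 4}

# Intercept and slope from the question 9 regression.
intercept, slope = 53.8080449, -0.8591989

# Predicted mean for each locale when the code is treated as a number.
linear_means = {group: intercept + slope * code for group, code in codes.items()}

# Differences from City implied by the linear model: forced to be equally spaced.
linear_gap_vs_city = {
    group: linear_means[group] - linear_means["City"]
    for group in codes
    if group != "City"
}

# Differences from City estimated when locale is a factor (Tukey table, question 3).
factor_gap_vs_city = {"Suburb": -0.5972363, "Town": -2.9765954, "Rural": -2.3044177}

for group in factor_gap_vs_city:
    print(
        f"{group} vs City: linear = {linear_gap_vs_city[group]:.2f}, "
        f"factor = {factor_gap_vs_city[group]:.2f}"
    )
```

The linear model forces each step in the code to lower the predicted score by the same 0.86 points, so Rural is predicted to be lowest. With locale as a factor, the differences are neither evenly spaced nor ordered that way: Town, not Rural, has the largest gap from City.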