Lab 02 - Confidence Intervals

CPP 523

Yuyu Bogdan


In this assignment you will be working with simulated data on class size and test scores. There are also variables for socio-economic status and teacher quality. We would like to understand how the relationship between class size and test scores changes when other variables are considered. We will be looking specifically at the confidence interval around the classroom size slope estimate.

We estimate the following models:

\(TestScore = b_0 + b_1 \cdot ClassSize + e_1 \ \ \ (Model \ 1)\)

\(TestScore = b_0 + b_1 \cdot ClassSize + b_2 \cdot TeacherQuality + e_2 \ \ \ (Model \ 2)\)

\(TestScore = b_0 + b_2 \cdot TeacherQuality + b_3 \cdot SES + e_3 \ \ \ (Model \ 3)\)

\(TestScore = b_0 + b_1 \cdot ClassSize + b_3 \cdot SES + e_4 \ \ \ (Model \ 4)\)

\(TestScore = B_0 + B_1 \cdot ClassSize + B_2 \cdot TeacherQuality + B_3 \cdot SES + \epsilon \ \ \ (Model \ 5)\)


Dependent Variable: Test Scores

                         Model 1     Model 2     Model 3     Model 4     Model 5
                           (1)         (2)         (3)         (4)         (5)
Classroom Size           -4.22***    -3.91***                -2.67       -2.22***
                         (0.18)      (0.03)                  (1.63)      (0.23)
Teacher Quality                      55.01***    55.03***                55.01***
                                     (0.25)      (0.26)                  (0.25)
Socio-Economic Status                            40.94***    16.34       17.77***
                                                 (0.27)      (17.10)     (2.40)
Intercept                738.34***   456.70***   272.91***   665.29***   377.26***
                         (4.88)      (1.48)      (1.39)      (76.57)     (10.82)
Observations             1,000       1,000       1,000       1,000       1,000
Adjusted R²              0.36        0.99        0.99        0.36        0.99

Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01


Lab-02 Questions:

Warm-up: Interpret the slope associated with Class Size in Model 01. What does a slope of -4.22 mean in this context? Is the negative sign a good thing or a bad thing?

Answer: A slope of -4.22 means that each additional student in a class is associated with a 4.22-point decrease in test scores, on average. The negative sign is a good thing because it means that smaller classes are associated with higher test scores.

Q (1)

What is the standard error associated with the slope on class size in Model 1?

Answer: se=0.18

Q (2)

Calculate the 95% confidence interval around the class size coefficient in Model 1. Is it statistically significant at this level? How do you know?

Answer:

\(b_1 - 1.96 \cdot se < CI < b_1 + 1.96 \cdot se\)

\(-4.22 - 1.96 \cdot 0.18 < CI < -4.22 + 1.96 \cdot 0.18\)

\(-4.5728 < CI < -3.8672\)

The coefficient is statistically significant at this level because the confidence interval does not contain 0.
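The interval arithmetic for Q2 can be checked with a short script. This is a minimal sketch that assumes the normal-approximation critical value of 1.96 used in the answer; the function names `confidence_interval` and `is_significant` are illustrative and not part of the lab materials.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Return the (lower, upper) bounds of estimate +/- z * std_error."""
    margin = z * std_error
    return estimate - margin, estimate + margin


def is_significant(lower, upper):
    """A coefficient is significant at this level when the interval excludes zero."""
    return not (lower <= 0 <= upper)


# Model 1 class size coefficient and standard error, taken from the regression table
lower, upper = confidence_interval(-4.22, 0.18)

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("Significant at this level:", is_significant(lower, upper))
```

The same two functions can be reused for Models 2 and 4 by swapping in their coefficients and standard errors.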

Visual

Q (3)

Calculate the 95% confidence interval around the class size coefficient in Model 2. Is it significant at this level? How do you know?

Answer:

\(-3.91 - 1.96 \cdot 0.03 < CI < -3.91 + 1.96 \cdot 0.03\)

\(-3.9688 < CI < -3.8512\)

The coefficient is significant at this level because the confidence interval does not contain 0.

Visual

Q (4)

Calculate the 95% confidence interval around the class size coefficient in Model 4. Is it significant at this level? How do you know?

Answer:

\(-2.67 - 1.96 \cdot 1.63 < CI < -2.67 + 1.96 \cdot 1.63\)

\(-5.8648 < CI < 0.5248\)

The coefficient is not significant at this level because the confidence interval contains 0.

Visual

Q (5)

Draw the three confidence intervals to see how they change as a result of the controls included in the model.

Which model has the “largest” slope? Note that the slope represents program impact, in this case how much test scores improve as you reduce average class size by one student. So the largest slope, or program effect size, is in absolute terms. A slope of -5, for example, means that test scores improve by 5 points when average class size falls by one student in a state. A slope of -3 means that test scores only improve by 3 points for the one-student reduction in class size. A slope of -3 is larger than -5 in mathematical terms, but when asked about slope size or program effects you should ignore the sign and compare absolute values. The intervention is about reducing class size, so we want a “large” negative slope in this context.

Which model has the smallest standard error? How can you tell?

You can reference the above graphics or re-create them here.

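Answer: Model 1 has the largest slope, since the absolute value of its class size coefficient (4.22) is the greatest of the three models compared. Model 2 has the smallest standard error (0.03). You can tell because its confidence interval is the narrowest: the width of a 95% interval is 2 * 1.96 * se, so a smaller standard error gives a tighter interval.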
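The comparison in Q5 can also be done numerically rather than by eye. The sketch below recomputes the three intervals from the coefficients and standard errors in the regression table, again assuming a critical value of 1.96; the dictionary layout and variable names are illustrative choices, not part of the lab.

```python
# Class size coefficient and standard error for the three models compared in Q5,
# copied from the regression table.
models = {
    "Model 1": (-4.22, 0.18),
    "Model 2": (-3.91, 0.03),
    "Model 4": (-2.67, 1.63),
}

Z = 1.96  # assumed critical value for a 95% interval (normal approximation)

summary = {}
for name, (slope, se) in models.items():
    lower, upper = slope - Z * se, slope + Z * se
    summary[name] = {
        "lower": lower,
        "upper": upper,
        "width": upper - lower,
        "contains_zero": lower <= 0 <= upper,
    }
    print(f"{name}: [{lower:.4f}, {upper:.4f}]  "
          f"width={upper - lower:.4f}  contains zero: {lower <= 0 <= upper}")

# Effect size is compared in absolute terms, as the question instructs.
largest_effect = max(models, key=lambda m: abs(models[m][0]))
# The smallest standard error also produces the narrowest interval.
smallest_se = min(models, key=lambda m: models[m][1])

print("Largest effect (absolute slope):", largest_effect)
print("Smallest standard error:", smallest_se)
```

Sorting by interval width gives the same ranking as sorting by standard error, because the width is just a constant multiple of the standard error.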

Q (6)

The covariance of class size and test scores is -418, and the variance of class size is 99. Can you calculate the slope of class size in Model 4 with the formula cov(x,y)/var(x)? Why or why not?

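Answer: No. The formula cov(x,y)/var(x) gives the slope only in a bivariate regression; here -418/99 ≈ -4.22, which is the class size slope in Model 1. Model 4 also includes SES as an independent variable, so its class size slope (-2.67) reflects the relationship after accounting for SES and cannot be recovered from the covariance of class size and test scores alone.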
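A quick arithmetic check makes the point for Q6 concrete. This sketch uses only the covariance and variance given in the question and the class size coefficients reported in the regression table; the variable names are illustrative.

```python
cov_xy = -418  # covariance of class size and test scores (given in Q6)
var_x = 99     # variance of class size (given in Q6)

# Bivariate slope formula: valid only when class size is the sole regressor.
bivariate_slope = cov_xy / var_x
print(f"cov(x,y)/var(x) = {bivariate_slope:.2f}")

# Class size coefficients reported in the regression table
model_1_slope = -4.22  # class size only
model_4_slope = -2.67  # class size plus SES

matches_model_1 = round(bivariate_slope, 2) == model_1_slope
matches_model_4 = round(bivariate_slope, 2) == model_4_slope

print("Matches Model 1:", matches_model_1)
print("Matches Model 4:", matches_model_4)
```

The formula reproduces the Model 1 slope but not the Model 4 slope, which is consistent with the answer: once SES enters the model, the class size coefficient depends on how class size and SES covary as well.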

Pairs plot for reference for Q6.