California Test Scores

library(dplyr)

CASchools <- read.csv("http://uclspp.github.io/PUBLG100/data/CA_Schools.csv")

CASchools <- CASchools %>%
  mutate(score = (read + math) / 2, STR = students / teachers)

model1 <- lm(score ~ STR, data = CASchools)
summary(model1)

## 
## Call:
## lm(formula = score ~ STR, data = CASchools)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.727 -14.251   0.483  12.822  48.540 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 698.9329     9.4675  73.825  < 2e-16 ***
## STR          -2.2798     0.4798  -4.751 2.78e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.58 on 418 degrees of freedom
## Multiple R-squared:  0.05124,    Adjusted R-squared:  0.04897 
## F-statistic: 22.58 on 1 and 418 DF,  p-value: 2.783e-06

plot(score ~ STR, data = CASchools)
abline(model1, col = "blue")

Now let’s double everyone’s score and refit the same model.

CASchools <- CASchools %>%
  mutate(score = score * 2)

model2 <- lm(score ~ STR, data = CASchools)
summary(model2)

## 
## Call:
## lm(formula = score ~ STR, data = CASchools)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -95.453 -28.501   0.965  25.644  97.081 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1397.8659    18.9350  73.825  < 2e-16 ***
## STR           -4.5596     0.9597  -4.751 2.78e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 37.16 on 418 degrees of freedom
## Multiple R-squared:  0.05124,    Adjusted R-squared:  0.04897 
## F-statistic: 22.58 on 1 and 418 DF,  p-value: 2.783e-06

plot(score ~ STR, data = CASchools)
abline(model2, col = "red")

The coefficients of model2 should be exactly twice the coefficients of model1:

coefficients(model1)

## (Intercept)         STR 
##  698.932949   -2.279808

coefficients(model2)

## (Intercept)         STR 
## 1397.865899   -4.559616
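
Why the doubling is exact can be sketched with the usual OLS formula. The notation here is an assumption introduced for illustration, not taken from the code above: y is the vector of scores, X is the design matrix (a column of ones and the STR column), and e is the residual vector.

$$
\hat{\beta} = (X^\top X)^{-1} X^\top y
\quad\Longrightarrow\quad
\hat{\beta}^{*} = (X^\top X)^{-1} X^\top (2y) = 2\hat{\beta}
$$

$$
e^{*} = 2y - X\hat{\beta}^{*} = 2y - 2X\hat{\beta} = 2\,(y - X\hat{\beta}) = 2e
$$

Because X is unchanged, multiplying y by any constant multiplies both the intercept and the slope by that same constant, and the residuals scale along with them.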

Likewise, the residuals of model2 should be twice the residuals of model1:

# just look at the first few
head(residuals(model1))

##         1         2         3         4         5         6 
##  32.65260  11.33917 -12.70686 -11.66198 -15.51590 -44.58079

head(residuals(model2))

##         1         2         3         4         5         6 
##  65.30520  22.67834 -25.41371 -23.32396 -31.03179 -89.16158
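
Rather than comparing printed values by eye, the doubling can be checked programmatically. The following is a minimal sketch on R's built-in `cars` dataset, used here only as a stand-in so the snippet runs without downloading anything; the names `fit1`, `fit2` and `cars2` are hypothetical and play the roles of model1, model2 and the rescaled CASchools.

```r
# Illustrative data only: the built-in `cars` dataset, not the California schools data.
fit1 <- lm(dist ~ speed, data = cars)

# Double the outcome and refit, mirroring the step from model1 to model2.
cars2 <- transform(cars, dist = dist * 2)
fit2 <- lm(dist ~ speed, data = cars2)

# all.equal() compares numeric vectors up to a small tolerance,
# which avoids false alarms from floating-point rounding.
coef_doubled  <- isTRUE(all.equal(coef(fit2), 2 * coef(fit1)))
resid_doubled <- isTRUE(all.equal(residuals(fit2), 2 * residuals(fit1)))

print(c(coef_doubled = coef_doubled, resid_doubled = resid_doubled))
```

The same two `all.equal()` comparisons should apply unchanged to model1 and model2 once they have been fitted as above.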

However, the relative effect of a 1-unit increase in STR on scores remains constant, even though the absolute numbers have doubled.

In model1: an increase of 1 in STR results in a -2.2798081 unit change in test scores, or -0.284976%.

In model2: an increase of 1 in STR results in a -4.5596163 unit change in test scores, or still -0.284976%.
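
The invariance of the relative effect can also be illustrated in code. This is a sketch under stated assumptions: it uses the built-in `cars` dataset instead of CASchools, and it expresses the slope as a percentage of the mean outcome, which is one possible baseline and not necessarily the one behind the percentages quoted above. It also checks that the t values and R-squared, which are identical in the two summaries, are unaffected by the rescaling.

```r
# Illustrative data only: the built-in `cars` dataset stands in for CASchools.
fit1 <- lm(dist ~ speed, data = cars)
fit2 <- lm(I(2 * dist) ~ speed, data = cars)  # same model with the outcome doubled

# Slope as a percentage of the mean outcome.
# Assumption: the mean is used as the baseline purely for illustration.
rel1 <- 100 * coef(fit1)[["speed"]] / mean(cars$dist)
rel2 <- 100 * coef(fit2)[["speed"]] / mean(2 * cars$dist)

# Scale-free quantities reported by summary(): t values and R-squared.
t1 <- summary(fit1)$coefficients[, "t value"]
t2 <- summary(fit2)$coefficients[, "t value"]

same_relative_effect <- isTRUE(all.equal(rel1, rel2))
same_t_values        <- isTRUE(all.equal(t1, t2))
same_r_squared       <- isTRUE(all.equal(summary(fit1)$r.squared,
                                         summary(fit2)$r.squared))

print(c(same_relative_effect = same_relative_effect,
        same_t_values = same_t_values,
        same_r_squared = same_r_squared))
```

The factor of 2 appears in both the slope and the baseline, so it cancels in the ratio; the same cancellation happens between each estimate and its standard error, which is why the t values do not move.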