library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
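# Load the district-level data from the Excel file
District_data <- read_excel("district.xls")
# Copy the outcome and the four predictors into descriptively named vectors
Percent_Meets_STAAR <- District_data$DDA00A001222R
Ratio <- District_data$DPSTKIDR
Turnover <- District_data$DPSTURNR
Experience_Yrs <- District_data$DPSTEXPA
Avg_Salary <- District_data$DPSTTOSA
# Fit a multiple linear regression of the STAAR percentage on the four predictors
Relationship_model <- lm(Percent_Meets_STAAR ~ Ratio + Turnover + Experience_Yrs + Avg_Salary, data = District_data)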
summary(Relationship_model)
##
## Call:
## lm(formula = Percent_Meets_STAAR ~ Ratio + Turnover + Experience_Yrs +
## Avg_Salary, data = District_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -52.283 -7.756 -0.931 7.441 43.866
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.120e+01 4.399e+00 9.366 < 2e-16 ***
## Ratio 3.122e-01 1.350e-01 2.313 0.0209 *
## Turnover -4.079e-01 3.692e-02 -11.050 < 2e-16 ***
## Experience_Yrs 8.464e-01 1.279e-01 6.617 5.51e-11 ***
## Avg_Salary 1.673e-07 7.486e-05 0.002 0.9982
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.9 on 1193 degrees of freedom
## (9 observations deleted due to missingness)
## Multiple R-squared: 0.2111, Adjusted R-squared: 0.2085
## F-statistic: 79.82 on 4 and 1193 DF, p-value: < 2.2e-16
This model evaluates the relationship between the percentage of students who meet grade level on STAAR and several variables from the District Data.
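Turnover and years of teaching experience are the most statistically significant predictors of student performance, with p-values very close to zero (< 2e-16 and 5.51e-11). Ratio is also significant at the 0.05 level (p = 0.0209), while average salary is not significant (p = 0.9982).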
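As a rough illustration of how the p-values can be read programmatically rather than by eye, the sketch below ranks predictors by p-value. It runs on simulated data, not the district file, and the names `toy`, `x1`, `x2`, and `y` are hypothetical stand-ins.

```r
# Simulated stand-in data; the names are hypothetical, not columns of the district file.
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 2 + 3 * toy$x1 + rnorm(n)   # y depends on x1 only

fit <- lm(y ~ x1 + x2, data = toy)

# coef(summary(fit)) is a numeric matrix with the columns
# "Estimate", "Std. Error", "t value", and "Pr(>|t|)".
coef_table <- coef(summary(fit))

# Drop the intercept row and order the predictors from smallest to largest p-value.
predictors <- coef_table[-1, , drop = FALSE]
ranked <- predictors[order(predictors[, "Pr(>|t|)"]), , drop = FALSE]
print(ranked)
```

The same pattern should work with `Relationship_model` in place of `fit`, since both are ordinary `lm` objects.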
The adjusted R-squared of 0.2085 shows that the model accounts for only about 21% of the variance in the percentage of students meeting grade level.
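To see where the adjusted value comes from, the short calculation below applies the standard adjustment formula to the numbers printed in the summary (multiple R-squared, residual degrees of freedom, and four predictors). The variable names are chosen here for illustration only.

```r
r_squared <- 0.2111      # Multiple R-squared reported in the summary
p <- 4                   # number of predictors in the model
resid_df <- 1193         # residual degrees of freedom reported in the summary
n <- resid_df + p + 1    # observations actually used in the fit

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
round(adj_r_squared, 4)  # approximately 0.2085, matching the summary
```

With roughly 1,200 observations and only four predictors, the penalty is small, which is why the adjusted and multiple R-squared values are so close.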
plot(Relationship_model, which = 1)
The relationship appears to be mostly linear, since the line in the residuals vs. fitted plot is nearly flat, but the residuals spread widely away from the line and are concentrated in one grouped area.
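One possible way to back up a visual reading of the residual plot with numbers is to compare the residuals across the lower and upper halves of the fitted values. The sketch below uses simulated data with constant noise, not the district data, and the names `x`, `y`, and `halves` are hypothetical.

```r
# Simulated stand-in data with a linear signal and constant noise.
set.seed(42)
n <- 300
x <- runif(n, 0, 10)
y <- 5 + 2 * x + rnorm(n, sd = 3)
fit <- lm(y ~ x)

fitted_vals <- fitted(fit)
resid_vals <- resid(fit)

# Split at the median fitted value and summarize the residuals in each half.
upper_half <- fitted_vals > median(fitted_vals)
halves <- data.frame(
  half = c("lower", "upper"),
  mean_resid = c(mean(resid_vals[!upper_half]), mean(resid_vals[upper_half])),
  sd_resid = c(sd(resid_vals[!upper_half]), sd(resid_vals[upper_half]))
)
print(halves)
```

Residual means near zero in both halves are consistent with a flat line in the plot, while clearly different standard deviations would suggest that the spread changes across the fitted values.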