I chose to use the WV Checkbook to break spending down further into categories. My thesis is that if schools spend more to attract teachers (higher pay, better benefits, etc.), then proficiency will rise in those schools. The categories I chose to look into further were teachers, technology, and books.
I was interested in more specific spending by county, but I still used the enrollment numbers from the table below to calculate spending per student.
## # A tibble: 55 × 8
## name enroll tfedrev tstrev tlocrev totalexp ppcstot county
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 BARBOUR CO SCH DIST 2144 7559 16584 5872 28021 11885 Barbour
## 2 BERKELEY CO SCH DIST 19722 48407 140127 86699 264253 12704 Berkeley
## 3 BOONE CO SCH DIST 3177 8194 26858 14564 48642 14663 Boone
## 4 BRAXTON CO SCH DIST 1747 5479 12748 6404 24417 13153 Braxton
## 5 BROOKE CO SCH DIST 2582 6791 17114 21352 41908 15642 Brooke
## 6 CABELL CO SCH DIST 11667 42518 88337 66699 183621 14538 Cabell
## 7 CALHOUN CO SCH DIST 861 3254 9953 3190 15154 16085 Calhoun
## 8 CLAY CO SCH DIST 1669 6157 17655 2791 25963 13825 Clay
## 9 DODDRIDGE CO SCH DIST 1082 3455 3999 31752 38493 23563 Doddrid…
## 10 FAYETTE CO SCH DIST 5594 15293 51759 23477 83373 13777 Fayette
## # ℹ 45 more rows
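The per-student calculation can be sketched roughly as follows. The enrollment figures come from the table above, but the `teacher_spend` column and its values are made-up placeholders for illustration, not real WV Checkbook numbers.

```r
# Enrollment figures are taken from the table above; the spending totals
# are made-up placeholder values, not real WV Checkbook numbers.
counties <- data.frame(
  county = c("Barbour", "Berkeley", "Boone"),
  enroll = c(2144, 19722, 3177),
  teacher_spend = c(1000000, 9000000, 1500000)  # hypothetical dollars
)

# Divide each county's category total by its enrollment
# to get dollars spent per student in that category.
counties$teacher_spend_per_student <- counties$teacher_spend / counties$enroll

print(counties)
```

Dividing by enrollment keeps large counties such as Berkeley from dominating the comparison simply because they have more students.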
I used the checkbook to download each county's spending data individually as a CSV, then used a function to automatically build one table from all the CSV files in my folder. This helped when I went from originally using 22 counties to using all 55 counties in the state.
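A minimal sketch of that kind of loader is shown here. The function name `read_county_csvs`, the one-file-per-county naming, and the `category` and `amount` columns are assumptions for illustration; the real checkbook exports may be laid out differently.

```r
# Hypothetical helper: read every CSV in `folder` and stack them into one table.
# Assumes one file per county, named after the county, with identical columns.
read_county_csvs <- function(folder) {
  files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
  tables <- lapply(files, function(path) {
    df <- read.csv(path, stringsAsFactors = FALSE)
    # Record which county the rows came from, using the file name without ".csv".
    df$county <- tools::file_path_sans_ext(basename(path))
    df
  })
  do.call(rbind, tables)
}

# Demonstration with two tiny placeholder files in a temporary folder.
demo_dir <- file.path(tempdir(), "county_csvs")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(category = c("teachers", "books"), amount = c(100, 20)),
          file.path(demo_dir, "Barbour.csv"), row.names = FALSE)
write.csv(data.frame(category = c("teachers", "books"), amount = c(300, 50)),
          file.path(demo_dir, "Berkeley.csv"), row.names = FALSE)

all_spending <- read_county_csvs(demo_dir)
print(all_spending)
```

Because the function picks up whatever CSV files are in the folder, adding more counties only requires downloading more files, not changing the code.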
## teachers technology books proficiency
## teachers 1.00000000 -0.02174409 0.2404514 0.3045906
## technology -0.02174409 1.00000000 -0.1525747 -0.1938212
## books 0.24045142 -0.15257468 1.0000000 0.1783361
## proficiency 0.30459062 -0.19382121 0.1783361 1.0000000
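A matrix like the one above can be produced with base R's `cor()`. The sketch below uses randomly generated placeholder data with the same column names, so its numbers will not match the real results, where proficiency correlates with teacher spending at about 0.30, with book spending at about 0.18, and with technology spending at about -0.19.

```r
# Placeholder data: random values standing in for per-student spending
# and proficiency. These are NOT the real county figures.
set.seed(1)
n <- 55  # one row per county
spending <- data.frame(
  teachers    = rnorm(n),
  technology  = rnorm(n),
  books       = rnorm(n),
  proficiency = rnorm(n)
)

# Pearson correlation between every pair of columns.
cor_matrix <- cor(spending)
print(round(cor_matrix, 3))
```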
## Decision Tree
## [1] "Mean Squared Error Train: 45.5"
## [1] "Mean Squared Error Test: 9.64"
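For reference, train and test errors of this kind can be computed roughly as follows. This is a sketch under assumptions: it uses the `rpart` package, an 80/20 train/test split, and simulated placeholder data, none of which is confirmed to match the original analysis.

```r
library(rpart)  # assumption: the tree was fit with rpart

# Simulated placeholder data, one row per county. Not the real figures.
set.seed(42)
n <- 55
county_data <- data.frame(
  teachers   = runif(n, 4000, 9000),
  technology = runif(n, 100, 600),
  books      = runif(n, 50, 300)
)
county_data$proficiency <- 20 + 0.003 * county_data$teachers + rnorm(n, sd = 5)

# Assumed 80/20 split into training and test rows.
train_rows <- sample(seq_len(n), size = round(0.8 * n))
train <- county_data[train_rows, ]
test  <- county_data[-train_rows, ]

# Regression tree predicting proficiency from the three spending categories.
tree <- rpart(proficiency ~ teachers + technology + books,
              data = train, method = "anova")

# Mean squared error: average squared gap between actual and predicted values.
mse <- function(actual, predicted) mean((actual - predicted)^2)
mse_train <- mse(train$proficiency, predict(tree, newdata = train))
mse_test  <- mse(test$proficiency,  predict(tree, newdata = test))

print(paste("Mean Squared Error Train:", round(mse_train, 2)))
print(paste("Mean Squared Error Test:", round(mse_test, 2)))
```

With only 55 counties, the test set holds about 11 rows, so the test error can swing noticeably depending on which counties land in it.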
The decision tree's error was high (train MSE of 45.5, test MSE of 9.64), which was not optimal. I also felt limited because the spending categories, while broken up, were still very broad. Breaking the data down to the school level would make it possible to see whether there are any traits that all good schools share.
School districts that spend more on teachers generally do have higher proficiency, although the relationship is modest (a correlation of about 0.30). In schools with better teachers, increased book spending has a further positive impact. Higher pay gets teachers in the door, but it is the impact they make once they are there that drives gains in proficiency.
The WV Checkbook was used to get spending data for all counties.
Copilot was used to generate titles and themes for the graphs.