Load assessment data

## # A tibble: 476 × 4
##    district population_group subgroup                                proficiency
##    <chr>    <chr>            <chr>                                         <dbl>
##  1 Barbour  Gender           Female                                        20.6 
##  2 Barbour  Gender           Male                                          22.9 
##  3 Barbour  Race/Ethnicity   Multi-Racial                                  35.7 
##  4 Barbour  Race/Ethnicity   White                                         21.7 
##  5 Barbour  Student Status   Economically Disadvantaged                    15.5 
##  6 Barbour  Student Status   Foster Care                                   14.0 
##  7 Barbour  Student Status   Special Education (Students with Disab…        5.59
##  8 Barbour  Total Population Total                                         21.8 
##  9 Berkeley Gender           Female                                        24.7 
## 10 Berkeley Gender           Male                                          24.9 
## # ℹ 466 more rows
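
The assessment table has several rows per district (one per subgroup), so it presumably has to be reduced to one row per district before it can be matched to county-level tables. The report does not show how this was done. A minimal sketch, assuming the `Total Population` row is the one kept, using rows copied from the output above (the row whose subgroup label is truncated in the printout is left out):

```r
library(dplyr)

# Rows copied from the printed tibble; the truncated "Special Education ..."
# row is omitted because its full label is not visible in the output.
assessment <- tibble::tribble(
  ~district,  ~population_group,  ~subgroup,                    ~proficiency,
  "Barbour",  "Gender",           "Female",                     20.6,
  "Barbour",  "Gender",           "Male",                       22.9,
  "Barbour",  "Race/Ethnicity",   "Multi-Racial",               35.7,
  "Barbour",  "Race/Ethnicity",   "White",                      21.7,
  "Barbour",  "Student Status",   "Economically Disadvantaged", 15.5,
  "Barbour",  "Student Status",   "Foster Care",                14.0,
  "Barbour",  "Total Population", "Total",                      21.8,
  "Berkeley", "Gender",           "Female",                     24.7,
  "Berkeley", "Gender",           "Male",                       24.9
)

# Assumption: keep only the district-wide total, giving one row per district.
district_totals <- assessment |>
  filter(population_group == "Total Population") |>
  select(district, proficiency)

print(district_totals)
```

Only Barbour's total is visible in the first ten printed rows, so the result here has a single row. On the full table the same filter would be expected to return one row per district.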

Load spending data

## # A tibble: 55 × 10
##    School_Name           enroll Federal_Revenue State_Revenue Local_Revenue
##    <chr>                  <dbl>           <dbl>         <dbl>         <dbl>
##  1 BARBOUR CO SCH DIST     2144            7559         16584          5872
##  2 BERKELEY CO SCH DIST   19722           48407        140127         86699
##  3 BOONE CO SCH DIST       3177            8194         26858         14564
##  4 BRAXTON CO SCH DIST     1747            5479         12748          6404
##  5 BROOKE CO SCH DIST      2582            6791         17114         21352
##  6 CABELL CO SCH DIST     11667           42518         88337         66699
##  7 CALHOUN CO SCH DIST      861            3254          9953          3190
##  8 CLAY CO SCH DIST        1669            6157         17655          2791
##  9 DODDRIDGE CO SCH DIST   1082            3455          3999         31752
## 10 FAYETTE CO SCH DIST     5594           15293         51759         23477
## # ℹ 45 more rows
## # ℹ 5 more variables: Total_Expenditures <dbl>, Total_Current_Spending <dbl>,
## #   Instruction_Spending <dbl>, Support_Services_Spending <dbl>, county <chr>
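
The spending table lists districts as `BARBOUR CO SCH DIST` and also carries a `county` column, whose values are not printed. How that column was built is not shown. One plausible sketch, assuming it comes from dropping the ` CO SCH DIST` suffix and title-casing the remainder:

```r
library(dplyr)
library(stringr)

# First three rows copied from the spending output above.
spending <- tibble::tibble(
  School_Name = c("BARBOUR CO SCH DIST", "BERKELEY CO SCH DIST", "BOONE CO SCH DIST"),
  enroll      = c(2144, 19722, 3177)
)

# Assumption: the join key is the district name without the " CO SCH DIST"
# suffix, in title case so that it lines up with names such as "Barbour".
spending_keyed <- spending |>
  mutate(
    county = School_Name |>
      str_remove(" CO SCH DIST$") |>
      str_to_title()
  )

print(spending_keyed)
```

`str_to_title()` lower-cases everything after the first letter of each word, which would produce spellings like `Mcdowell` rather than `McDowell`. Every table used as a join partner would need the same casing.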

Load demographic data

## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 62 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): County, FIPS, Rank within US (of 3143 counties)
## dbl (2): Value (Percent), People (Unemployed)
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## # A tibble: 55 × 2
##    county   percent_unemployed
##    <chr>                 <dbl>
##  1 Mcdowell               15.1
##  2 Braxton                14.4
##  3 Logan                  13.3
##  4 Calhoun                12.2
##  5 Roane                  11.7
##  6 Clay                   11.2
##  7 Mingo                  11.2
##  8 Webster                11.1
##  9 Monroe                 10.6
## 10 Barbour                10.1
## # ℹ 45 more rows

Joined data
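
The join itself is not shown in the report. As a rough sketch, the spending and unemployment tables could be combined on `county`, assuming the spending table's `county` column holds title-case names that match the unemployment table. The rows below are the four counties visible in both printed outputs:

```r
library(dplyr)

# Values copied from the printed outputs; `county` values for the spending
# table are an assumption, since that column is not visible in the printout.
spending <- tibble::tibble(
  county        = c("Barbour", "Braxton", "Calhoun", "Clay"),
  enroll        = c(2144, 1747, 861, 1669),
  Local_Revenue = c(5872, 6404, 3190, 2791)
)
unemployment <- tibble::tibble(
  county             = c("Braxton", "Calhoun", "Clay", "Barbour"),
  percent_unemployed = c(14.4, 12.2, 11.2, 10.1)
)

# Keep only counties present in both tables.
joined <- inner_join(spending, unemployment, by = "county")

# Counties in `spending` with no match; ideally this is empty.
unmatched <- anti_join(spending, unemployment, by = "county")

print(joined)
print(unmatched)
```

The assessment table could be attached the same way with `by = c("county" = "district")`. Checking `anti_join()` first is a cheap way to catch spelling or casing mismatches in the keys before they silently drop rows.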

Correlations
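
The report does not show which variables were correlated. A minimal sketch of a pairwise correlation matrix over the numeric columns, using only the four counties visible in both the spending and unemployment printouts (far too few rows for the values to mean anything; this illustrates the call, not the report's result):

```r
library(dplyr)

# Four counties whose values appear in both printed tables above.
joined <- tibble::tibble(
  county             = c("Barbour", "Braxton", "Calhoun", "Clay"),
  enroll             = c(2144, 1747, 861, 1669),
  Local_Revenue      = c(5872, 6404, 3190, 2791),
  percent_unemployed = c(10.1, 14.4, 12.2, 11.2)
)

# Pearson correlations between every pair of numeric columns;
# "pairwise.complete.obs" tolerates missing values column pair by column pair.
cor_mat <- joined |>
  select(where(is.numeric)) |>
  cor(use = "pairwise.complete.obs")

print(round(cor_mat, 2))
```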

Linear regression model
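
The model specification is not shown in the report; the outcome would presumably be `proficiency`, but that is a guess. As a syntax sketch only, the following fits `lm()` to the ten spending rows printed earlier, with `Local_Revenue ~ enroll` as a stand-in formula rather than the report's actual model:

```r
# Ten rows copied from the spending output above.
spending <- data.frame(
  enroll        = c(2144, 19722, 3177, 1747, 2582, 11667, 861, 1669, 1082, 5594),
  Local_Revenue = c(5872, 86699, 14564, 6404, 21352, 66699, 3190, 2791, 31752, 23477)
)

# Stand-in formula (an assumption): local revenue as a linear function of enrollment.
fit <- lm(Local_Revenue ~ enroll, data = spending)

# Coefficients, standard errors, and R-squared.
print(summary(fit))
```

Swapping in the joined data and a formula such as `proficiency ~ percent_unemployed` would follow the same pattern, provided those columns exist in the joined table.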