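# Keep the district name, percent special education (DPETSPEP),
# and the special education spending variable (DPFPASPEP).
districtinfo <- district %>% select(DISTNAME, DPETSPEP, DPFPASPEP)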

summary(districtinfo$DPETSPEP)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    9.90   12.10   12.27   14.20   51.70
summary(districtinfo$DPFPASPEP)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   5.800   8.900   9.711  12.500  49.000       5
districtinfo %>% filter(DPFPASPEP>0)
## # A tibble: 1,201 × 3
##    DISTNAME                     DPETSPEP DPFPASPEP
##    <chr>                           <dbl>     <dbl>
##  1 CAYUGA ISD                       14.6      28.9
##  2 ELKHART ISD                      12.1       8.8
##  3 FRANKSTON ISD                    13.1       8.4
##  4 NECHES ISD                       10.5      10.1
##  5 PALESTINE ISD                    13.5       6.1
##  6 WESTWOOD ISD                     14.5       9.4
##  7 SLOCUM ISD                       14.7       9.9
##  8 ANDREWS ISD                      10.4      10.9
##  9 PINEYWOODS COMMUNITY ACADEMY     11.6       9.2
## 10 HUDSON ISD                       11.9      10.3
## # ℹ 1,191 more rows
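# filter() drops rows where the condition is FALSE or NA,
# so this removes both the missing values and any zeros in DPFPASPEP.
districtinfov2 <- districtinfo %>% filter(DPFPASPEP > 0)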


ggplot(districtinfov2, aes(x=DPFPASPEP, y=DPETSPEP))+ geom_point()

cor(districtinfov2$DPFPASPEP,districtinfov2$DPETSPEP)
## [1] 0.371033

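Values are missing from DPFPASPEP ("money spent on special education"): the summary shows 5 NAs. Filtering on DPFPASPEP > 0 removed those rows, along with any district reporting 0, and reduced the observations from 1,207 to 1,201. The correlation between percent special education (DPETSPEP) and money spent on special education (DPFPASPEP) is about 0.37. That is a weak positive relationship: r squared is roughly 0.14, so spending accounts for only about 14% of the variation in percent special education. There is no strong correlation between the two. The analysis would have to be expanded to look at how special education funding is allocated.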
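
The row counts and the correlation above can be checked a little more carefully. The sketch below is only illustrative: `toy` is a hypothetical stand-in for `districtinfo` with invented values (not the real district data), and it uses base R so it runs without any packages. It counts how many rows drop out because of NAs versus zeros, then compares the correlation on the filtered rows with `cor(..., use = "complete.obs")`, which keeps the zeros and drops only incomplete pairs. `cor.test()` is one way to get a confidence interval for r.

```r
# Hypothetical toy data standing in for districtinfo; values are invented.
toy <- data.frame(
  DISTNAME  = paste("DISTRICT", 1:8),
  DPETSPEP  = c(14.6, 12.1, 13.1, 10.5, 9.8, 11.2, 15.0, 12.7),
  DPFPASPEP = c(28.9, 8.8, NA, 10.1, 0, 9.4, 12.3, 7.5)
)

# Count why rows drop out: missing values vs. reported zeros.
n_missing <- sum(is.na(toy$DPFPASPEP))
n_zero    <- sum(toy$DPFPASPEP == 0, na.rm = TRUE)

# Same condition as filter(DPFPASPEP > 0): NA rows are dropped as well.
kept <- toy[!is.na(toy$DPFPASPEP) & toy$DPFPASPEP > 0, ]

# Correlation on the filtered rows (what the analysis above does).
r_filtered <- cor(kept$DPFPASPEP, kept$DPETSPEP)

# Alternative: keep the zeros and drop only pairs with a missing value.
r_complete <- cor(toy$DPFPASPEP, toy$DPETSPEP, use = "complete.obs")

# cor.test() reports a confidence interval and p-value for r.
ct <- cor.test(kept$DPFPASPEP, kept$DPETSPEP)

print(c(missing = n_missing, zero = n_zero, kept = nrow(kept)))
print(c(filtered = r_filtered, complete = r_complete))
print(ct$conf.int)
```

On the real data, the same counts would show how many of the 6 dropped rows were NAs and how many were zeros, and whether keeping the zeros changes the correlation.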