library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Read the district data and keep the district ID plus the two special-education columns
district <- read_excel("district.xls")
new_df <- district %>% select(DISTRICT, DPETSPEP, DPFPASPEP)
summary(new_df$DPETSPEP)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    0.00    9.90   12.10   12.27   14.20   51.70
summary(new_df$DPFPASPEP)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
##   0.000   5.800   8.900   9.711  12.500  49.000       5
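
The NA's column in the second summary can also be checked directly. A minimal sketch, using a small made-up data frame (`toy` and its values are hypothetical, not taken from district.xls), that counts missing values per column and the rows left after dropping them:

```r
# Hypothetical stand-in for new_df; the values are invented for illustration
toy <- data.frame(
  DISTRICT  = c("001", "002", "003", "004"),
  DPETSPEP  = c(9.9, 12.1, 14.2, 11.0),
  DPFPASPEP = c(5.8, NA, 12.5, 8.9)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Rows remaining once any row with a missing value is dropped
rows_left <- nrow(na.omit(toy))
print(rows_left)
```

On the real data the same two calls would be `colSums(is.na(new_df))` and `nrow(na.omit(new_df))`.
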
# Scatterplot of DPETSPEP (x-axis) against DPFPASPEP (y-axis)
compare_two <- district %>% select(DISTNAME, DPETSPEP, DPFPASPEP)
ggplot(compare_two, aes(DPETSPEP, DPFPASPEP)) + geom_point()
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).
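
A fitted line can make the direction of the relationship easier to see. This is only a sketch on invented data (`toy` is hypothetical): `geom_smooth(method = "lm")` adds a least-squares line, and `na.rm = TRUE` drops incomplete rows without the warning.

```r
library(ggplot2)

# Hypothetical stand-in for compare_two; values are invented for illustration
toy <- data.frame(
  DPETSPEP  = c(9.9, 12.1, 14.2, 11.0, 16.3),
  DPFPASPEP = c(5.8, NA, 12.5, 8.9, 10.1)
)

# Scatterplot plus a straight least-squares trend line
p <- ggplot(toy, aes(DPETSPEP, DPFPASPEP)) +
  geom_point(na.rm = TRUE) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, na.rm = TRUE)

print(p)
```
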

# Drop rows with any missing value, then compute the Pearson correlation
newdf2 <- new_df %>% na.omit()
cor(newdf2$DPETSPEP, newdf2$DPFPASPEP)
## [1] 0.3700234
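
cor() can also skip incomplete rows itself, and squaring the result gives the share of variance that a straight line explains. A small sketch on invented values (`x` and `y` are hypothetical, not the district data):

```r
# Hypothetical values, invented for illustration
x <- c(9.9, 12.1, 14.2, 11.0, 16.3)
y <- c(5.8, NA, 12.5, 8.9, 10.1)

# Without handling the NA, cor() returns NA
r_raw <- cor(x, y)

# use = "complete.obs" drops pairs with a missing value first
r <- cor(x, y, use = "complete.obs")

# Squared correlation: proportion of variance explained by a linear fit
r_squared <- r^2
print(c(r = r, r_squared = r_squared))
```

For the value reported above, 0.3700234^2 is roughly 0.137.
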
- Which variable has missing values? DPFPASPEP. Its summary shows 5
NA's, while DPETSPEP has none.
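- Remove the missing observations. How many are left overall?
na.omit() drops the 5 rows where DPFPASPEP is missing; the number left
is nrow(newdf2), which I did not print above.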
- Create a point graph (hint: ggplot + geom_point()) to compare
DPFPASPEP and DPETSPEP. Are they correlated? Only loosely. If I am
reading this correctly, the funds do not look like they are allocated
strictly by how many students are in special ed. It could also be that
students get less per person the more special-ed students a district
has.
- Do a mathematical check (cor()) of DPFPASPEP and DPETSPEP. What is
the result? 0.3700234
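- How would you interpret these results? (No real right or wrong
answer – just tell me what you see) As I understand it, the plot and
cor() both ask how closely the two percentages move together. A
correlation of about 0.37 is a weak-to-moderate positive relationship:
districts with a larger share of special-ed students tend to spend a
larger share on special ed, but the link is loose (r squared is only
about 0.14), so spending seems to depend a lot on the individual
district. The two percentages are also in a similar range (medians of
12.1 and 8.9).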