DPSTEXPA - average years of teacher experience
DPSTTOSA - average teacher salary
DA0CSA21R - average SAT score
DDA00A001S22R - average STAAR pass rate
1) summary(data$x)
hist(data$x) - for continuous variables
plot(data$x, data$y) - to compare variables
(ggplot is fine if you are comfortable with it)
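For anyone who prefers ggplot, the two base plots above could be sketched roughly as follows. The data frame `toy` and its values are made up for illustration only; they stand in for `data$x` and `data$y` and are not taken from district.xls.

```r
library(ggplot2)

# Hypothetical toy data standing in for data$x and data$y (values are made up)
toy <- data.frame(
  x = c(8.5, 10.2, 11.9, 13.4, 15.0),
  y = c(940, 1010, 980, 1050, 1020)
)

# Histogram of one continuous variable (ggplot counterpart of hist(data$x))
p_hist <- ggplot(toy, aes(x = x)) +
  geom_histogram(bins = 5)

# Scatter plot of two variables (ggplot counterpart of plot(data$x, data$y))
p_scatter <- ggplot(toy, aes(x = x, y = y)) +
  geom_point()

print(p_hist)
print(p_scatter)
```

With the real data, `toy` would be replaced by the data frame read from the file and `x`/`y` by the column names listed above.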
library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
district <- read_excel("district.xls")
summary(district$DPSTEXPA)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
##    0.00   10.07   12.00   11.75   13.90   22.90       3
summary(district$DA0CSA21R)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
##    -1.0   887.0   973.0   823.9  1039.0  1344.0     262
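The SAT summary has a minimum of -1, which is not a possible score, and a mean (823.9) that falls below the first quartile (887). That pattern would fit a placeholder value being counted as a real score. The sketch below assumes -1 is a "not reported" code; that is a guess that should be checked against the data dictionary. The vector `sat` is made up for illustration.

```r
# Hypothetical SAT-like vector; -1 stands in for a "not reported" code (assumption)
sat <- c(-1, 887, 973, 1039, -1, 1344, NA)

# Recode the placeholder to NA so it no longer counts as a real score
sat_clean <- ifelse(sat == -1, NA, sat)

summary(sat)        # placeholder values pull the mean down
summary(sat_clean)  # placeholders are now counted under NA's

# With the real data the same recode might look like this
# (commented out because it needs district.xls):
# district$DA0CSA21R[which(district$DA0CSA21R == -1)] <- NA
```

If the assumption holds, recoding before `na.omit()` would keep the placeholder rows out of the later correlations.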
hist(district$DPSTEXPA)
plot(district$DPSTEXPA, district$DA0CSA21R)
hist(district$DA0CSA21R)
hist(district$DPSTTOSA)
plot(district$DPSTEXPA, district$DA0CSA21R)
plot(district$DPSTTOSA, district$DA0CSA21R)
new_district <- district %>%
  select(DPSTEXPA, DA0CSA21R, DDA00A001S22R) %>%
  na.omit()
cor(new_district$DPSTEXPA, new_district$DA0CSA21R)
## [1] -0.1079977
cor(new_district$DPSTEXPA, new_district$DDA00A001S22R)
## [1] 0.3728339
??? For Prof: Is the error above happening because some of the variables are factors or otherwise not numbers, so R can't compare them?
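One way to test that idea is to look at the column types directly. The sketch below uses a made-up data frame `toy` (not the district data) with one column stored as text, and shows that `cor()` stops with an error when given a non-numeric input. Whether this matches the error in question is an assumption.

```r
# Hypothetical toy data: one numeric column, one that was read in as text
toy <- data.frame(
  experience = c(10.1, 12.0, 13.9, 11.5),
  sat        = c("887", "973", "1039", "1001"),  # character, not numeric
  stringsAsFactors = FALSE
)

# Check the type of every column
print(sapply(toy, class))

# cor() refuses non-numeric input; capture the error message instead of stopping
msg <- tryCatch(
  cor(toy$experience, toy$sat),
  error = function(e) conditionMessage(e)
)
print(msg)

# Converting the text column to numeric lets cor() run
r <- cor(toy$experience, as.numeric(toy$sat))
print(r)
```

Running `sapply(district, class)` on the real data would show whether any of the four variables came in as character or factor. For a factor, `as.numeric(as.character(f))` is the safer conversion, since `as.numeric(f)` returns the internal level codes.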