R Markdown

DPSTEXPA - average years of teacher experience

DPSTTOSA - average teacher salary

DA0CSA21R - average SAT score

DDA00A001S22R - average STAAR pass rate

  1. summary(data$x)

  2. hist(data$x) - for continuous variables

  3. plot(data$x,data$y) - to compare variables

(ggplot is fine if you are comfortable with it)

  4. cor(data$x,data$y) - to see a correlation between two variables

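For the ggplot option mentioned in the list, a scatterplot could be sketched roughly as follows. The data frame `toy` and its columns `x` and `y` are made up for illustration and are not part of the district data.

```r
library(ggplot2)

# Hypothetical toy data standing in for two district-level columns
toy <- data.frame(
  x = c(8.5, 10.2, 11.9, 12.4, 14.1, 15.3),
  y = c(940, 910, 1005, 980, 1020, 995)
)

# Scatterplot: the ggplot2 counterpart of plot(toy$x, toy$y)
p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  labs(x = "x", y = "y")

print(p)
```

Swapping `geom_point()` for `geom_histogram()` (with only `x` mapped in `aes()`) would give the counterpart of `hist()`.
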
library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
district<-read_excel("district.xls")
summary(district$DPSTEXPA)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    0.00   10.07   12.00   11.75   13.90   22.90       3
summary(district$DA0CSA21R)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    -1.0   887.0   973.0   823.9  1039.0  1344.0     262
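
The SAT summary has a minimum of -1.0 and a mean (823.9) well below the median (973.0), which hints that -1 may be a placeholder code rather than a real score. That reading is an assumption and should be checked against the data documentation. A minimal sketch of recoding such a value to NA before summarizing, using a made-up vector `sat`:

```r
# Made-up SAT averages; -1 stands in for a suspected placeholder code
sat <- c(-1, 887, 973, 1039, -1, 1100, NA)

# Recode the placeholder to NA so it no longer distorts the summary
sat_clean <- replace(sat, which(sat == -1), NA)

summary(sat)        # mean is pulled down by the -1 entries
summary(sat_clean)  # the -1 entries are now counted under NA's
mean(sat_clean, na.rm = TRUE)
```
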
hist(district$DPSTEXPA)

plot(district$DPSTEXPA, district$DA0CSA21R)

hist(district$DA0CSA21R)

hist(district$DPSTTOSA)

plot(district$DPSTEXPA,district$DA0CSA21R)

plot(district$DPSTTOSA,district$DA0CSA21R)

new_district<-district%>%select(DPSTEXPA,DA0CSA21R,DDA00A001S22R)%>%na.omit(.)
cor(new_district$DPSTEXPA,new_district$DA0CSA21R)
## [1] -0.1079977
cor(new_district$DPSTEXPA,new_district$DDA00A001S22R)
## [1] 0.3728339
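
Because `na.omit()` is applied to all three selected columns at once, a row is dropped if any of the three values is missing, so both correlations above are computed on the same set of rows. An alternative is to let `cor()` drop missing values for just the two vectors being compared, through its `use` argument. The sketch below uses a made-up data frame `toy` with hypothetical column names.

```r
# Made-up data with missing values scattered across columns
toy <- data.frame(
  experience = c(10, 12, 14, 9, 16, 11),
  sat        = c(950, NA, 1010, 900, 1050, NA),
  staar      = c(70, 75, NA, 65, 85, 72)
)

# Listwise deletion: keep only rows with no missing value in any column
complete_rows <- na.omit(toy)
nrow(complete_rows)

cor(complete_rows$experience, complete_rows$staar)

# Drop only the rows where one of these two vectors is missing
cor(toy$experience, toy$staar, use = "complete.obs")
```

The two results can differ because the second call keeps rows that are missing only `sat`.
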

??? For Prof: Is the error above caused by factors or other variables that are not numbers, so that it can’t compare them?
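
One way to check the guess in the question is to look at the column types, since `cor()` only accepts numeric input and stops with an error otherwise. Whether that is what happened here cannot be told from the output shown. The sketch uses a made-up data frame `toy` with hypothetical columns.

```r
# Made-up data with one character column and two numeric columns
toy <- data.frame(
  name  = c("A", "B", "C", "D"),
  score = c(900, 950, 1010, 980),
  years = c(9, 11, 14, 12)
)

# Which columns are numeric?
sapply(toy, is.numeric)

# cor() on a character column raises an error; capture the message instead of stopping
result <- tryCatch(
  cor(toy$name, toy$years),
  error = function(e) conditionMessage(e)
)
result

# Restricting to the numeric columns works
numeric_cols <- toy[sapply(toy, is.numeric)]
cor(numeric_cols$score, numeric_cols$years)
```
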