I don't quite understand this in practice yet. I need to go back, study a lot more, and practice different things; I think I'm getting lost in some of the basics. I decided to use the district data because a) I am currently in Capstone and we are doing qualitative research, and b) this doesn't come naturally to me, so I thought I would use the dataset that would be simplest to learn, as you obviously have experience and familiarity with it already.
library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Read the district-level data from the Excel workbook
district <- read_excel("district.xls")
summary(district$DZCAMPUS)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 3.000 7.428 5.000 273.000
summary(district$DPSTTOSA)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 36081 50439 53382 53971 56919 110560 4
summary(district$DPETALLC)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.0 337.5 884.0 4476.3 2746.0 193727.0
hist(district$DPETALLC)
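The summary of DPETALLC above shows a mean (4476.3) far above the median (884.0) and a maximum of 193727, so the distribution is strongly right-skewed and a plain histogram will probably pile almost every district into the first bar. A minimal sketch of one option, a histogram on a log10 scale, is below. It uses simulated values (made up for illustration, not taken from district.xls), and the names `enrollment` and `log_enrollment` are hypothetical.

```r
set.seed(1)

# Simulated right-skewed counts standing in for DPETALLC (hypothetical values).
enrollment <- round(rlnorm(1000, meanlog = 7, sdlog = 1.5)) + 1

# Raw scale: most values are crowded into the lowest bins.
hist(enrollment, main = "Raw scale", xlab = "Total students")

# log10 scale: the spread across small and large values is easier to see.
log_enrollment <- log10(enrollment)
hist(log_enrollment, main = "log10 scale", xlab = "log10(total students)")
```

With the real data the same idea would be `hist(log10(district$DPETALLC))`. Because the minimum reported above is 4, the log is defined for every district.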
plot(district$DPSSTOSA, district$DZCAMPUS)
cor(district$DPSTTOSA, district$DPSSTOSA)
## [1] NA
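The `NA` here is most likely caused by missing values: by default `cor()` returns `NA` whenever either input contains an `NA`, and the summary above reports 4 `NA`s in DPSTTOSA. A minimal sketch of one way to handle this is below. It uses a small made-up data frame with the same column names (the values are illustrative only, not from district.xls), and `toy`, `r_default`, `r_complete`, and `n_used` are hypothetical names.

```r
# Toy data standing in for district.xls; values are made up for illustration.
toy <- data.frame(
  DPSTTOSA = c(50000, 52000, NA, 56000, 60000),
  DPSSTOSA = c(60000, 61000, 65000, NA, 70000)
)

# Default behaviour: any missing value in either vector makes the result NA.
r_default <- cor(toy$DPSTTOSA, toy$DPSSTOSA)

# use = "complete.obs" drops rows where either variable is missing.
r_complete <- cor(toy$DPSTTOSA, toy$DPSSTOSA, use = "complete.obs")

# Count how many rows were actually used in the calculation.
n_used <- sum(complete.cases(toy$DPSTTOSA, toy$DPSSTOSA))

print(r_default)
print(r_complete)
print(n_used)
```

On the real data the equivalent call would be `cor(district$DPSTTOSA, district$DPSSTOSA, use = "complete.obs")`. It is worth reporting how many districts were dropped alongside the correlation, since the result then describes only the districts with both values present.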