library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
teacher_data <- read_csv("Teacher_Hiring_Certification_Turnover.csv")
## Rows: 33 Columns: 25
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): REGION, distname, geotype_new, region_lea, Year
## dbl (20): district, schyr, intern, other_temp, oos_std, lag_starter, no_cert...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
teacher_data <- teacher_data %>% rename(teacher_attrition = turnover_rate_teachers)
Variable Definitions:
(Dependent) teacher_attrition: the teacher turnover rate (renamed from turnover_rate_teachers), i.e. the rate at which teachers left their positions within a given academic year
(Independent) beg_year: teachers who are in their first year of teaching experience
1-5_years: teachers with 1 to 5 years of teaching experience
6-10_years: teachers with 6 to 10 years of teaching experience
11-20_years: teachers with 11 to 20 years of teaching experience
over20_years: teachers with over 20 years of teaching experience
cor(teacher_data$beg_year, teacher_data$teacher_attrition)
## [1] 0.5663391
cor(teacher_data$`1-5_years`, teacher_data$teacher_attrition)
## [1] 0.2895374
cor(teacher_data$`6-10_years`, teacher_data$teacher_attrition)
## [1] 0.3116338
cor(teacher_data$`11-20_years`, teacher_data$teacher_attrition)
## [1] 0.327162
cor(teacher_data$over20_years, teacher_data$teacher_attrition)
## [1] 0.1164551
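The five `cor()` calls above follow the same pattern, so they can be collapsed into a single call. The following is a minimal sketch: the helper `cor_with_outcome()` and the small data frame `toy` are hypothetical, and `toy` holds made-up values (not the teacher dataset) so that the example runs without the CSV.

```r
# Hypothetical helper: correlate each predictor column with one outcome column.
cor_with_outcome <- function(data, predictors, outcome) {
  vapply(
    predictors,
    function(col) cor(data[[col]], data[[outcome]]),
    numeric(1)
  )
}

# Made-up illustrative values, NOT taken from Teacher_Hiring_Certification_Turnover.csv.
toy <- data.frame(
  beg_year          = c(1, 2, 3, 4, 5),
  over20_years      = c(5, 3, 4, 1, 2),
  teacher_attrition = c(2, 4, 6, 8, 10)
)

# Returns a named numeric vector, one correlation per predictor.
toy_cors <- cor_with_outcome(toy, c("beg_year", "over20_years"), "teacher_attrition")
toy_cors
```

On the real data the same helper could presumably be called as `cor_with_outcome(teacher_data, c("beg_year", "1-5_years", "6-10_years", "11-20_years", "over20_years"), "teacher_attrition")`, since `[[` accepts the non-syntactic column names without backticks.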
pairs(~ beg_year + `1-5_years` + `6-10_years` + `11-20_years` + over20_years + teacher_attrition, data = teacher_data)
cor.test(teacher_data$beg_year, teacher_data$teacher_attrition, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: teacher_data$beg_year and teacher_data$teacher_attrition
## t = 3.826, df = 31, p-value = 0.0005911
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2768597 0.7615755
## sample estimates:
## cor
## 0.5663391
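The Pearson correlation coefficient is about 0.566 (95% confidence interval 0.277 to 0.762), indicating a moderate positive relationship between “beg_year” and “teacher_attrition”. The p-value of 0.0005911 (t = 3.826, df = 31) is well below 0.05, so the correlation is statistically significant. Because “teacher_attrition” is a turnover rate (renamed from turnover_rate_teachers) rather than an individual-level outcome, the result suggests that a larger presence of first-year teachers goes together with higher attrition; on its own it does not show that individual new teachers are more likely to leave their positions.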
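To make the link between the coefficient, the test statistic, and the confidence interval explicit, the reported numbers can be recomputed by hand. This is a sketch that takes r = 0.5663391 and n = 33 from the output above and applies the standard formulas for a Pearson test (a t statistic with n - 2 degrees of freedom and a Fisher z interval); the variable names are illustrative.

```r
r <- 0.5663391  # sample correlation reported by cor.test()
n <- 33         # number of rows reported by read_csv()

# t statistic and two-sided p-value with n - 2 degrees of freedom
dof     <- n - 2
t_stat  <- r * sqrt(dof) / sqrt(1 - r^2)
p_value <- 2 * pt(abs(t_stat), df = dof, lower.tail = FALSE)

# 95% confidence interval via the Fisher z-transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

c(t = t_stat, df = dof, p = p_value, lower = ci[1], upper = ci[2])
```

The values should agree with the `cor.test()` output up to rounding, which is a quick check that the interpretation rests on the figures shown.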
Pearson’s correlation is appropriate in this case because both “beg_year” and “teacher_attrition” are continuous variables and their relationship appears approximately linear.
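One informal way to probe the linearity assumption is to compare Pearson's coefficient with the rank-based Spearman coefficient: when the relationship is roughly linear, the two tend to be close. The sketch below uses simulated data, not the teacher dataset; the data frame `sim`, its slope, and its noise level are assumptions chosen only for illustration.

```r
set.seed(1)  # make the simulated example reproducible

# Simulated, roughly linear data; NOT the teacher dataset.
sim <- data.frame(beg_year = runif(33, min = 0, max = 10))
sim$teacher_attrition <- 5 + 0.8 * sim$beg_year + rnorm(33, sd = 2)

# Pearson measures linear association; Spearman measures monotonic association.
pearson  <- cor(sim$beg_year, sim$teacher_attrition, method = "pearson")
spearman <- cor(sim$beg_year, sim$teacher_attrition, method = "spearman")

c(pearson = pearson, spearman = spearman)
```

Applied to the real data, the same comparison would use `teacher_data$beg_year` and `teacher_data$teacher_attrition`; a large gap between the two coefficients would be a reason to look more closely at the scatterplot before relying on Pearson.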