Is there a significant difference in the average annual Overtime Pay earned by employees in the Department of Police compared to the Fire and Rescue Services?
This analysis utilizes the Employee Salaries dataset sourced from the Montgomery County Open Data Portal. The original dataset contains 10,632 observations tracking county workforce earnings. This study focuses specifically on two departments to compare compensation profiles:
Department Name: Categorical variable used to group observations (Police vs. Fire and Rescue).
2025 Overtime Pay: Quantitative variable representing the annual dollar amount earned in overtime by each employee.
# Load packages used below (dplyr for the cleaning pipeline, ggplot2 for the boxplot)
library(dplyr)
library(ggplot2)
# Load dataset
salary_data <- read.csv("Employee_Salaries_-_2025_20260421.csv")
# 2 Required EDA functions to understand the data
str(salary_data)
## 'data.frame': 10632 obs. of 8 variables:
## $ Department : chr "ABS" "CUS" "DGS" "CEX" ...
## $ Department.Name : chr "Alcohol Beverage Services" "Community Use of Public Facilities" "Department of General Services" "Offices of the County Executive" ...
## $ Division : chr "ABS 85 IT Administration" "CUS 70 Finance and Administrative Support Team" "DGS 36 Fleet Management Services" "CEX 15 Chief Administrative Officer's Office" ...
## $ Gender : chr "M" "F" "F" "F" ...
## $ Base.Salary : chr "$174,641.65" "$146,133.24" "$90,149" "$207,000" ...
## $ X2025.Overtime.Pay : chr "$0" "$0" "$651.7" "$0" ...
## $ X2025.Longevity.Pay: chr "$0" "$0" "$2,884.02" "$0" ...
## $ Grade : chr "M3" "M3" "16" "EX2" ...
summary(salary_data)
## Department Department.Name Division Gender
## Length:10632 Length:10632 Length:10632 Length:10632
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## Base.Salary X2025.Overtime.Pay X2025.Longevity.Pay Grade
## Length:10632 Length:10632 Length:10632 Length:10632
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
# Clean and subset using 3 dplyr functions (filter, mutate, select)
cleaned_data <- salary_data %>%
  # 1. Filter for only the two target departments
  filter(Department.Name %in% c("Department of Police", "Fire and Rescue Services")) %>%
  # 2. Convert character currency ("$1,234.56") into numeric format
  mutate(Overtime_Pay = as.numeric(gsub("[\\$,]", "", X2025.Overtime.Pay))) %>%
  # 3. Select only the necessary columns for the analysis
  select(Department.Name, Overtime_Pay) %>%
  # Remove missing rows if any exist
  filter(!is.na(Overtime_Pay))
# Verify cleaned rows
dim(cleaned_data)
## [1] 3146 2
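The currency-to-numeric conversion in the `mutate()` step can be checked in isolation. The following is a minimal sketch on a hand-made vector whose strings mirror the format printed by `str()` above. The helper name `parse_currency` is hypothetical and is not part of the original analysis.

```r
# Hypothetical helper: strip "$" and thousands separators, then convert to numeric
parse_currency <- function(x) {
  as.numeric(gsub("[$,]", "", x))
}

# Sample strings in the same format as the raw pay columns
raw <- c("$174,641.65", "$0", "$651.7", "$2,884.02")
parsed <- parse_currency(raw)
print(parsed)

# Strings that are not valid numbers after stripping become NA (with a warning),
# which is why the pipeline ends with filter(!is.na(Overtime_Pay))
not_a_number <- suppressWarnings(parse_currency("N/A"))
print(not_a_number)
```

Inside a character class, `$` and `,` are literal characters, so the pattern `"[$,]"` needs no escaping.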
ggplot(cleaned_data, aes(x = Department.Name, y = Overtime_Pay, fill = Department.Name)) +
  geom_boxplot(alpha = 0.7, outlier.color = "red", outlier.shape = 1) +
  labs(title = "Annual Overtime Pay Distribution by Department",
       x = "Department", y = "2025 Overtime Pay ($)") +
  theme_minimal() +
  theme(legend.position = "none")
# Perform independent two-sample t-test assuming unequal variances (Welch's t-test)
test_results <- t.test(Overtime_Pay ~ Department.Name, data = cleaned_data, alternative = "two.sided")
print(test_results)
##
## Welch Two Sample t-test
##
## data: Overtime_Pay by Department.Name
## t = -8.1392, df = 2785.3, p-value = 5.935e-16
## alternative hypothesis: true difference in means between group Department of Police and group Fire and Rescue Services is not equal to 0
## 95 percent confidence interval:
## -9107.595 -5571.284
## sample estimates:
## mean in group Department of Police mean in group Fire and Rescue Services
## 13922.55 21261.99
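Using a significance level of \(\alpha = 0.05\), the Welch two-sample t-test yielded a test statistic of \(t = -8.1392\) with \(df = 2785.3\) and a p-value of \(5.935 \times 10^{-16}\). Because the p-value is far below \(\alpha\), we reject the null hypothesis that the two departments have equal mean overtime pay. The 95% confidence interval for the difference in means (Police minus Fire and Rescue) runs from about \(-\$9,107.60\) to \(-\$5,571.28\), which excludes zero.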
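To make the reported statistic and fractional degrees of freedom less opaque, the sketch below computes a Welch test by hand and compares it with `t.test()`. It uses simulated values, not the county data; the group sizes, the exponential distribution, and all variable names are assumptions chosen only for illustration.

```r
# Simulated overtime values for two groups (illustrative only, not county data)
set.seed(1)
police <- rexp(200, rate = 1 / 14000)
fire   <- rexp(150, rate = 1 / 21000)

# Squared standard error of each group mean
v1 <- var(police) / length(police)
v2 <- var(fire) / length(fire)

# Welch t statistic: difference in means over the combined standard error
t_manual <- (mean(police) - mean(fire)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (v1 + v2)^2 /
  (v1^2 / (length(police) - 1) + v2^2 / (length(fire) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Built-in version; var.equal = FALSE is the default, so this is Welch's test
builtin <- t.test(police, fire)

print(c(t = t_manual, df = df_manual, p = p_manual))
print(builtin)
```

The degrees of freedom are generally not an integer because they are an approximation that depends on both sample variances, which is why the output above reports `df = 2785.3`.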
The analysis shows a clear difference in overtime compensation between the two departments. On average, Fire and Rescue Services employees received roughly \(\$7,339.44\) more in 2025 overtime pay than Department of Police employees (\(\$21,261.99\) vs. \(\$13,922.55\)). The test itself does not identify the cause of this gap. Plausible explanations include longer continuous shifts, different on-call cycles, or staffing demands in fire and rescue operations that require heavier overtime budget allocations than in the police department.
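As a worked check, the size of the gap follows directly from the sample estimates in the test output, and the same output implies an approximate standard error for the difference. The values below are the rounded figures as printed, so the final digits are approximate.

\[
\bar{x}_{\text{Fire}} - \bar{x}_{\text{Police}} = 21261.99 - 13922.55 = 7339.44
\]

\[
\text{CI midpoint} = \frac{-9107.595 + (-5571.284)}{2} \approx -7339.44,
\qquad
\text{SE} \approx \frac{7339.44}{8.1392} \approx 901.7
\]

The midpoint of the confidence interval matches the difference in sample means (Police minus Fire and Rescue), as it should for a symmetric t-based interval.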
The primary limitation of this analysis is that it compares raw annual overtime totals without accounting for operational variables such as job rank, years of tenure, base hourly rate, or emergency-driven spikes during the year. For future research, we recommend an ANOVA to compare overtime pay across all county departments, or a multiple linear regression to isolate departmental effects while holding employee demographics and experience constant.
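The two follow-up approaches could look roughly like the sketch below. It runs on simulated stand-in data, since the dataset shown above has no tenure column. The department labels, the `tenure` variable, and the effect sizes are all hypothetical and say nothing about the real county figures.

```r
set.seed(42)
n <- 300

# Simulated stand-in data: every column and effect size here is hypothetical
sim <- data.frame(
  dept   = factor(sample(c("Police", "Fire", "Transit"), n, replace = TRUE)),
  tenure = runif(n, min = 0, max = 30)
)
sim$overtime <- 10000 + 5000 * (sim$dept == "Fire") +
  200 * sim$tenure + rnorm(n, sd = 4000)

# One-way ANOVA: compares mean overtime across more than two departments
anova_fit <- aov(overtime ~ dept, data = sim)
print(summary(anova_fit))

# Multiple linear regression: department effect while holding tenure constant
lm_fit <- lm(overtime ~ dept + tenure, data = sim)
print(summary(lm_fit)$coefficients)
```

In the regression, the department coefficients are read relative to the reference level (here `Fire`, the first level alphabetically), at a fixed value of `tenure`.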