A. Introduction

Research question: Is there a significant difference in the average annual overtime pay earned by employees in the Department of Police compared with employees in Fire and Rescue Services?

This analysis uses the Employee Salaries dataset from the Montgomery County Open Data Portal. The original dataset contains 10,632 observations of county workforce earnings. This study focuses on two departments, the Department of Police and Fire and Rescue Services, and compares their 2025 overtime pay.


B. Data Analysis

Chunk 1: Initial Dataset Inspection

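# Load packages used in later chunks (dplyr for %>%, filter, mutate, select; ggplot2 for the boxplot)
library(dplyr)
library(ggplot2)

# Load dataset
salary_data <- read.csv("Employee_Salaries_-_2025_20260421.csv")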

# Two required EDA functions: str() for column types, summary() for per-column summaries
str(salary_data)
## 'data.frame':    10632 obs. of  8 variables:
##  $ Department         : chr  "ABS" "CUS" "DGS" "CEX" ...
##  $ Department.Name    : chr  "Alcohol Beverage Services" "Community Use of Public Facilities" "Department of General Services" "Offices of the County Executive" ...
##  $ Division           : chr  "ABS 85 IT Administration" "CUS 70 Finance and Administrative Support Team" "DGS 36 Fleet Management Services" "CEX 15 Chief Administrative Officer's Office" ...
##  $ Gender             : chr  "M" "F" "F" "F" ...
##  $ Base.Salary        : chr  "$174,641.65" "$146,133.24" "$90,149" "$207,000" ...
##  $ X2025.Overtime.Pay : chr  "$0" "$0" "$651.7" "$0" ...
##  $ X2025.Longevity.Pay: chr  "$0" "$0" "$2,884.02" "$0" ...
##  $ Grade              : chr  "M3" "M3" "16" "EX2" ...
summary(salary_data)
##   Department        Department.Name      Division            Gender         
##  Length:10632       Length:10632       Length:10632       Length:10632      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##  Base.Salary        X2025.Overtime.Pay X2025.Longevity.Pay    Grade          
##  Length:10632       Length:10632       Length:10632        Length:10632      
##  Class :character   Class :character   Class :character    Class :character  
##  Mode  :character   Mode  :character   Mode  :character    Mode  :character

Chunk 2: Data Cleaning and Preparation (dplyr)

# Clean and subset using 3 dplyr functions (filter, mutate, select)
cleaned_data <- salary_data %>%
  # 1. Filter for only the two target departments
  filter(Department.Name %in% c("Department of Police", "Fire and Rescue Services")) %>%
  
  # 2. Convert currency strings (e.g. "$1,234.56") to numeric by stripping "$" and ","
  mutate(Overtime_Pay = as.numeric(gsub("[\\$,]", "", X2025.Overtime.Pay))) %>%
  
  # 3. Select only the necessary columns for the analysis
  select(Department.Name, Overtime_Pay) %>%
  
  # Remove missing rows if any exist
  filter(!is.na(Overtime_Pay))

# Verify cleaned rows
dim(cleaned_data)
## [1] 3146    2
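
The currency conversion in step 2 can be checked in isolation. The sketch below applies the same `gsub()` plus `as.numeric()` idea to a few strings copied from the `str()` output in Chunk 1. The helper name `parse_currency` is hypothetical and is not part of the original analysis, and the `"N/A"` input is an invented example of an unparseable value.

```r
# Hypothetical helper (not in the original analysis): strip "$" and ","
# from currency strings, then convert the result to numeric.
parse_currency <- function(x) {
  as.numeric(gsub("[$,]", "", x))
}

# Sample strings copied from the str() output in Chunk 1
sample_pay <- c("$0", "$651.7", "$2,884.02", "$174,641.65")
parsed_pay <- parse_currency(sample_pay)
print(parsed_pay)

# An unparseable string (invented here for illustration) becomes NA,
# which is the kind of row the final filter(!is.na(Overtime_Pay)) step drops.
unparsed <- suppressWarnings(parse_currency("N/A"))
print(is.na(unparsed))
```

Wrapping the conversion in a small function makes it easy to reuse for `Base.Salary` or `X2025.Longevity.Pay` if those columns are needed later.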

Chunk 3: Data Visualization

ggplot(cleaned_data, aes(x = Department.Name, y = Overtime_Pay, fill = Department.Name)) +
  geom_boxplot(alpha = 0.7, outlier.color = "red", outlier.shape = 1) +
  labs(title = "Annual Overtime Pay Distribution by Department",
       x = "Department", y = "2025 Overtime Pay ($)") +
  theme_minimal() +
  theme(legend.position = "none")


C. Statistical Analysis

Hypotheses

  • Null Hypothesis (\(H_0\)): \(\mu_{Police} = \mu_{Fire}\) (The population mean annual overtime pay for the Department of Police is equal to that of Fire and Rescue Services).
  • Alternative Hypothesis (\(H_a\)): \(\mu_{Police} \neq \mu_{Fire}\) (The population mean annual overtime pay is different between the two departments).

Chunk 4: Two-Sample Independent t-test

# Perform an independent two-sample t-test without assuming equal variances
# (Welch's t-test; t.test() uses var.equal = FALSE by default)
test_results <- t.test(Overtime_Pay ~ Department.Name, data = cleaned_data, alternative = "two.sided")
print(test_results)
## 
##  Welch Two Sample t-test
## 
## data:  Overtime_Pay by Department.Name
## t = -8.1392, df = 2785.3, p-value = 5.935e-16
## alternative hypothesis: true difference in means between group Department of Police and group Fire and Rescue Services is not equal to 0
## 95 percent confidence interval:
##  -9107.595 -5571.284
## sample estimates:
##     mean in group Department of Police mean in group Fire and Rescue Services 
##                               13922.55                               21261.99

Interpretation of Results

Using a significance level of \(\alpha = 0.05\), the Welch two-sample t-test yielded a test statistic of \(t = -8.1392\) with \(df = 2785.3\) and a p-value of \(5.935 \times 10^{-16}\).

  • Decision: Since the p-value (\(5.935 \times 10^{-16}\)) is far below \(\alpha = 0.05\), we reject the null hypothesis (\(H_0\)).
  • Conclusion: There is a statistically significant difference in mean annual overtime pay between the two departments. Employees in Fire and Rescue Services earn a higher mean annual overtime pay (\(\$21,261.99\)) than employees in the Department of Police (\(\$13,922.55\)). We are \(95\%\) confident that the true difference in population means (Police minus Fire) lies between \(-\$9,107.60\) and \(-\$5,571.28\); in other words, Fire and Rescue Services employees average between \(\$5,571.28\) and \(\$9,107.60\) more in overtime pay.
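
For reference, the Welch statistic and its approximate degrees of freedom take the standard textbook form sketched below. The symbols \(\bar{x}\), \(s^2\), and \(n\) (sample mean, sample variance, and group size) are notation introduced here for illustration; the group sizes and variances were not printed in the output above, so they are left symbolic.

\[
t = \frac{\bar{x}_{Police} - \bar{x}_{Fire}}{\sqrt{\dfrac{s_{Police}^2}{n_{Police}} + \dfrac{s_{Fire}^2}{n_{Fire}}}},
\qquad
df \approx \frac{\left(\dfrac{s_{Police}^2}{n_{Police}} + \dfrac{s_{Fire}^2}{n_{Fire}}\right)^2}{\dfrac{\left(s_{Police}^2 / n_{Police}\right)^2}{n_{Police} - 1} + \dfrac{\left(s_{Fire}^2 / n_{Fire}\right)^2}{n_{Fire} - 1}}
\]

With the reported sample means, the numerator is \(13,922.55 - 21,261.99 = -7,339.44\). Dividing by the reported \(t = -8.1392\) implies a standard error of roughly \(\$902\) for the difference in means.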

D. Conclusion and Future Directions

Summary and Implications

Our statistical analysis shows a clear difference in overtime compensation between the two departments. On average, Fire and Rescue Services staff receive about \(\$7,339\) more in annual overtime pay than Department of Police employees (\(\$21,261.99 - \$13,922.55 = \$7,339.44\)). This difference suggests that fire and rescue operations may rely on longer continuous shifts, different on-call cycles, or staffing demands that require larger overtime budgets than the police department, although the data analyzed here cannot distinguish among these explanations.
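
The headline figures can be cross-checked against the printed `t.test()` output using only the numbers reported in Chunk 4. This is a minimal sketch: the variable names are illustrative, and the two standard-error estimates are back-calculated from rounded output, so they should agree only approximately.

```r
# Values copied from the printed t.test() output in Chunk 4
mean_police <- 13922.55
mean_fire   <- 21261.99
ci_lower    <- -9107.595
ci_upper    <- -5571.284
t_stat      <- -8.1392
df_welch    <- 2785.3

# Difference in sample means, Police minus Fire (the direction t.test() reports)
diff_means <- mean_police - mean_fire

# A t-based confidence interval is symmetric, so its midpoint
# should match the difference in sample means
ci_midpoint <- (ci_lower + ci_upper) / 2

# Standard error implied by the t statistic ...
se_from_t <- diff_means / t_stat

# ... and by the interval half-width and the 97.5% t quantile
se_from_ci <- (ci_upper - ci_lower) / (2 * qt(0.975, df_welch))

print(c(diff_means = diff_means, ci_midpoint = ci_midpoint))
print(c(se_from_t = se_from_t, se_from_ci = se_from_ci))
```

If the two standard-error estimates differed by more than rounding error, that would point to a transcription mistake in the reported interval or test statistic.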

Limitations and Future Directions

The primary limitation of this analysis is that it compares each employee's annual overtime total without accounting for operational variables such as job rank, years of tenure, base hourly rate, or unexpected emergency spikes during the fiscal year. Future work could use a one-way ANOVA to compare overtime pay across all county departments, or fit a multiple linear regression model to isolate departmental effects while holding employee demographics and experience constant.


E. References

```