Research question:

Does fire station number and time of day predict whether an emergency call is medical rather than non-medical in Montgomery County?

For this project, I use the FRS Incidents By Station Daily dataset from the Montgomery County, Maryland Open Data Portal. This dataset contains detailed records of emergency incidents handled by Montgomery County Fire & Rescue Service (MCFRS), including which fire station responded, the type of call, the date and time, and the station’s address and geographic coordinates.

The raw CSV file has about 641,827 observations and 8 variables: Fire.Station, Fire.Station.Number, Call.Type.Description, Incident.Number, Date, Time, Station.address, and Location. Each row represents a single incident. For this analysis, I focus on whether the call appears to be medical (e.g., seizure, sick person, stroke) or non-medical (e.g., service call, gas leak, appliance issue) and how that depends on time of day and fire station number.

Data source:

Because the outcome (“medical vs non-medical”) is binary, I will use logistic regression as my modeling approach (Step 3).

# 1. Load packages
library(tidyverse)

# 2. Read the CSV 
data <- read.csv("FRS_Incidents_By_Station_Daily_20251205.csv")

# 3. Check that it loaded correctly
dim(data)      
## [1] 641827      8
names(data)   
## [1] "Fire.Station"          "Fire.Station.Number"   "Call.Type.Description"
## [4] "Incident.Number"       "Date"                  "Time"                 
## [7] "Station.address"       "Location"
head(data)     
##                                             Fire.Station Fire.Station.Number
## 1      Kensington Volunteer Fire Department (Station 18)                  18
## 2                               Travilah Fire Department                  32
## 3       Rockville Volunteer Fire Department (Station 23)                  23
## 4     Sandy Spring Volunteer Fire Department (Station 4)                   4
## 5 Cabin John Park Volunteer Fire Department (Station 30)                  30
## 6      Hillandale Volunteer Fire Department (Station 24)                  24
##                         Call.Type.Description Incident.Number     Date     Time
## 1                               SEIZURE - BLS     24-00073658 6/5/2024  7:27:32
## 2                                SERVICE CALL     24-00073996 6/5/2024 19:37:31
## 3 DEFECTIVE APPLIANCE - FIRE ROUTINE RESPONSE     24-00074053 6/5/2024 20:49:29
## 4                        DECREASED LOC - ALS1     24-00074011 6/5/2024 20:01:36
## 5                     AUTOMATIC MEDICAL ALARM     24-00074082 6/5/2024 22:34:21
## 6                     AUTOMATIC MEDICAL ALARM     24-00073918 6/5/2024 16:23:29
##                                   Station.address
## 1             12251 Georgia Ave Wheaton, MD 20902
## 2                  9615 DARNESTOWN ROAD, MD 20850
## 3             121 Rollins Ave Rockville, MD 20852
## 4        17921 Brooke Road Sandy Spring, MD 20860
## 5                 9404 Falls Rd Potomac, MD 20854
## 6 13216 New Hampshire Ave Silver Spring, MD 20904
##                             Location
## 1 POINT (-77.049526962 39.057882991)
## 2 POINT (-77.196985007 39.094044994)
## 3 POINT (-77.122987973 39.058988019)
## 4 POINT (-77.027180976 39.150644005)
## 5 POINT (-77.220201035 39.010221017)
## 6  POINT (-77.00324796 39.073032988)
str(data)
## 'data.frame':    641827 obs. of  8 variables:
##  $ Fire.Station         : chr  "Kensington Volunteer Fire Department (Station 18)" "Travilah Fire Department" "Rockville Volunteer Fire Department (Station 23)" "Sandy Spring Volunteer Fire Department (Station 4)" ...
##  $ Fire.Station.Number  : int  18 32 23 4 30 24 14 15 16 22 ...
##  $ Call.Type.Description: chr  "SEIZURE - BLS" "SERVICE CALL" "DEFECTIVE APPLIANCE - FIRE ROUTINE RESPONSE" "DECREASED LOC - ALS1" ...
##  $ Incident.Number      : chr  "24-00073658" "24-00073996" "24-00074053" "24-00074011" ...
##  $ Date                 : chr  "6/5/2024" "6/5/2024" "6/5/2024" "6/5/2024" ...
##  $ Time                 : chr  "7:27:32" "19:37:31" "20:49:29" "20:01:36" ...
##  $ Station.address      : chr  "12251 Georgia Ave Wheaton, MD 20902" "9615 DARNESTOWN ROAD, MD 20850" "121 Rollins Ave Rockville, MD 20852" "17921 Brooke Road Sandy Spring, MD 20860" ...
##  $ Location             : chr  "POINT (-77.049526962 39.057882991)" "POINT (-77.196985007 39.094044994)" "POINT (-77.122987973 39.058988019)" "POINT (-77.027180976 39.150644005)" ...

Data Analysis

In this section, I clean the dataset, create new variables needed for modeling, and perform basic exploratory data analysis (EDA). I focus on three main variables:

MedicalEmergency: a new binary outcome (Medical vs NonMedical)

TimeOfDay: a categorical variable (Night, Morning, Afternoon, Evening)

Fire.Station.Number: numerical station identifier

data <- dplyr::select(
data,
Fire.Station,
Fire.Station.Number,
Call.Type.Description,
Date,
Time,
Station.address,
Location
)

head(data)
##                                             Fire.Station Fire.Station.Number
## 1      Kensington Volunteer Fire Department (Station 18)                  18
## 2                               Travilah Fire Department                  32
## 3       Rockville Volunteer Fire Department (Station 23)                  23
## 4     Sandy Spring Volunteer Fire Department (Station 4)                   4
## 5 Cabin John Park Volunteer Fire Department (Station 30)                  30
## 6      Hillandale Volunteer Fire Department (Station 24)                  24
##                         Call.Type.Description     Date     Time
## 1                               SEIZURE - BLS 6/5/2024  7:27:32
## 2                                SERVICE CALL 6/5/2024 19:37:31
## 3 DEFECTIVE APPLIANCE - FIRE ROUTINE RESPONSE 6/5/2024 20:49:29
## 4                        DECREASED LOC - ALS1 6/5/2024 20:01:36
## 5                     AUTOMATIC MEDICAL ALARM 6/5/2024 22:34:21
## 6                     AUTOMATIC MEDICAL ALARM 6/5/2024 16:23:29
##                                   Station.address
## 1             12251 Georgia Ave Wheaton, MD 20902
## 2                  9615 DARNESTOWN ROAD, MD 20850
## 3             121 Rollins Ave Rockville, MD 20852
## 4        17921 Brooke Road Sandy Spring, MD 20860
## 5                 9404 Falls Rd Potomac, MD 20854
## 6 13216 New Hampshire Ave Silver Spring, MD 20904
##                             Location
## 1 POINT (-77.049526962 39.057882991)
## 2 POINT (-77.196985007 39.094044994)
## 3 POINT (-77.122987973 39.058988019)
## 4 POINT (-77.027180976 39.150644005)
## 5 POINT (-77.220201035 39.010221017)
## 6  POINT (-77.00324796 39.073032988)
str(data)
## 'data.frame':    641827 obs. of  7 variables:
##  $ Fire.Station         : chr  "Kensington Volunteer Fire Department (Station 18)" "Travilah Fire Department" "Rockville Volunteer Fire Department (Station 23)" "Sandy Spring Volunteer Fire Department (Station 4)" ...
##  $ Fire.Station.Number  : int  18 32 23 4 30 24 14 15 16 22 ...
##  $ Call.Type.Description: chr  "SEIZURE - BLS" "SERVICE CALL" "DEFECTIVE APPLIANCE - FIRE ROUTINE RESPONSE" "DECREASED LOC - ALS1" ...
##  $ Date                 : chr  "6/5/2024" "6/5/2024" "6/5/2024" "6/5/2024" ...
##  $ Time                 : chr  "7:27:32" "19:37:31" "20:49:29" "20:01:36" ...
##  $ Station.address      : chr  "12251 Georgia Ave Wheaton, MD 20902" "9615 DARNESTOWN ROAD, MD 20850" "121 Rollins Ave Rockville, MD 20852" "17921 Brooke Road Sandy Spring, MD 20860" ...
##  $ Location             : chr  "POINT (-77.049526962 39.057882991)" "POINT (-77.196985007 39.094044994)" "POINT (-77.122987973 39.058988019)" "POINT (-77.027180976 39.150644005)" ...
# Patterns that look like medical calls

medical_patterns <- c(
"SEIZURE", "SICK", "STROKE", "HEART",
"CHEST PAIN", "TROUBLE BREATHING",
"ABDOMINAL", "ALLERGIC", "UNCONSCIOUS",
"HEMMORRHAGE", "INJURED", "MEDICAL", "ALS", "BLS"
)

# Uppercase description

data$Call.Desc.Upper <- toupper(data$Call.Type.Description)

# Binary outcome: Medical vs NonMedical

pattern <- str_c(medical_patterns, collapse = "|")
data$MedicalEmergency <- ifelse(
str_detect(data$Call.Desc.Upper, pattern),
"Medical",
"NonMedical"
)

# Parse time and create hour

data$TimeParsed <- hms(data$Time)
data$Hour <- hour(data$TimeParsed)

# Time-of-day factor

data$TimeOfDay <- ifelse(
data$Hour >= 5 & data$Hour < 12, "Morning",
ifelse(
data$Hour >= 12 & data$Hour < 17, "Afternoon",
ifelse(
data$Hour >= 17 & data$Hour < 21, "Evening",
"Night"
)
)
)

# Turn into factors with explicit levels

data$MedicalEmergency <- factor(
data$MedicalEmergency,
levels = c("NonMedical", "Medical")
)

data$TimeOfDay <- factor(
data$TimeOfDay,
levels = c("Night", "Morning", "Afternoon", "Evening")
)

str(data)
## 'data.frame':    641827 obs. of  12 variables:
##  $ Fire.Station         : chr  "Kensington Volunteer Fire Department (Station 18)" "Travilah Fire Department" "Rockville Volunteer Fire Department (Station 23)" "Sandy Spring Volunteer Fire Department (Station 4)" ...
##  $ Fire.Station.Number  : int  18 32 23 4 30 24 14 15 16 22 ...
##  $ Call.Type.Description: chr  "SEIZURE - BLS" "SERVICE CALL" "DEFECTIVE APPLIANCE - FIRE ROUTINE RESPONSE" "DECREASED LOC - ALS1" ...
##  $ Date                 : chr  "6/5/2024" "6/5/2024" "6/5/2024" "6/5/2024" ...
##  $ Time                 : chr  "7:27:32" "19:37:31" "20:49:29" "20:01:36" ...
##  $ Station.address      : chr  "12251 Georgia Ave Wheaton, MD 20902" "9615 DARNESTOWN ROAD, MD 20850" "121 Rollins Ave Rockville, MD 20852" "17921 Brooke Road Sandy Spring, MD 20860" ...
##  $ Location             : chr  "POINT (-77.049526962 39.057882991)" "POINT (-77.196985007 39.094044994)" "POINT (-77.122987973 39.058988019)" "POINT (-77.027180976 39.150644005)" ...
##  $ Call.Desc.Upper      : chr  "SEIZURE - BLS" "SERVICE CALL" "DEFECTIVE APPLIANCE - FIRE ROUTINE RESPONSE" "DECREASED LOC - ALS1" ...
##  $ MedicalEmergency     : Factor w/ 2 levels "NonMedical","Medical": 2 1 1 2 2 2 2 2 2 2 ...
##  $ TimeParsed           :Formal class 'Period' [package "lubridate"] with 6 slots
##   .. ..@ .Data : num  32 31 29 36 21 29 42 6 45 18 ...
##   .. ..@ year  : num  0 0 0 0 0 0 0 0 0 0 ...
##   .. ..@ month : num  0 0 0 0 0 0 0 0 0 0 ...
##   .. ..@ day   : num  0 0 0 0 0 0 0 0 0 0 ...
##   .. ..@ hour  : num  7 19 20 20 22 16 19 21 22 19 ...
##   .. ..@ minute: num  27 37 49 1 34 23 50 20 13 5 ...
##  $ Hour                 : num  7 19 20 20 22 16 19 21 22 19 ...
##  $ TimeOfDay            : Factor w/ 4 levels "Night","Morning",..: 2 4 4 4 1 3 4 1 1 4 ...
# NA counts in key variables

colSums(is.na(data[, c("MedicalEmergency", "TimeOfDay", "Fire.Station.Number")]))
##    MedicalEmergency           TimeOfDay Fire.Station.Number 
##                   0                   0                   0
n_before <- nrow(data)

# Keep only rows with complete outcome and predictors

keep_rows <- !is.na(data$MedicalEmergency) &
!is.na(data$TimeOfDay) &
!is.na(data$Fire.Station.Number)

data <- data[keep_rows, ]

n_after <- nrow(data)

n_before
## [1] 641827
n_after
## [1] 641827
# Counts of medical vs non-medical

xtabs(~ MedicalEmergency, data = data)
## MedicalEmergency
## NonMedical    Medical 
##     140378     501449
prop.table(xtabs(~ MedicalEmergency, data = data))
## MedicalEmergency
## NonMedical    Medical 
##  0.2187163  0.7812837
# Cross-tab by time of day

xtabs(~ MedicalEmergency + TimeOfDay, data = data)
##                 TimeOfDay
## MedicalEmergency  Night Morning Afternoon Evening
##       NonMedical  29340   38414     39415   33209
##       Medical    116258  139750    142688  102753

Regression Analysis

I use logistic regression because the outcome variable (MedicalEmergency) is binary. Two models were estimated:

Simple model: MedicalEmergency ~ TimeOfDay

Full model: MedicalEmergency ~ Fire.Station.Number + TimeOfDay

The full model tests both predictors simultaneously.

Model 1: Time of Day Only

logistic <- glm(
MedicalEmergency ~ TimeOfDay,
data = data,
family = "binomial"
)

summary(logistic)
## 
## Call:
## glm(formula = MedicalEmergency ~ TimeOfDay, family = "binomial", 
##     data = data)
## 
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)    
## (Intercept)         1.376860   0.006533 210.743   <2e-16 ***
## TimeOfDayMorning   -0.085427   0.008710  -9.807   <2e-16 ***
## TimeOfDayAfternoon -0.090346   0.008664 -10.428   <2e-16 ***
## TimeOfDayEvening   -0.247353   0.009085 -27.228   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 674276  on 641826  degrees of freedom
## Residual deviance: 673498  on 641823  degrees of freedom
## AIC: 673506
## 
## Number of Fisher Scoring iterations: 4

Interpretation: Time of day meaningfully affects the odds of receiving a medical call. Some time periods have higher odds ratios, indicating medical calls cluster at certain hours.

Model 2: Fire Station Number + Time of Day

logistic2 <- glm(
MedicalEmergency ~ Fire.Station.Number + TimeOfDay,
data = data,
family = "binomial"
)

summary(logistic2)
## 
## Call:
## glm(formula = MedicalEmergency ~ Fire.Station.Number + TimeOfDay, 
##     family = "binomial", data = data)
## 
## Coefficients:
##                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)          1.3454830  0.0081100 165.905  < 2e-16 ***
## Fire.Station.Number  0.0018343  0.0002819   6.507 7.67e-11 ***
## TimeOfDayMorning    -0.0865896  0.0087126  -9.938  < 2e-16 ***
## TimeOfDayAfternoon  -0.0913519  0.0086656 -10.542  < 2e-16 ***
## TimeOfDayEvening    -0.2480467  0.0090855 -27.301  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 674276  on 641826  degrees of freedom
## Residual deviance: 673455  on 641822  degrees of freedom
## AIC: 673465
## 
## Number of Fisher Scoring iterations: 4
r_square_all <- 1 - (logistic2$deviance / logistic2$null.deviance)
r_square_all
## [1] 0.001216511
odds_ratios <- exp(coef(logistic2))
odds_ratios
##         (Intercept) Fire.Station.Number    TimeOfDayMorning  TimeOfDayAfternoon 
##           3.8400410           1.0018360           0.9170534           0.9126965 
##    TimeOfDayEvening 
##           0.7803235

Interpretation: Both fire station number and time of day significantly influence the likelihood of a medical emergency.

Higher station numbers slightly increase medical call probability.

Nighttime has the lowest medical-call odds, while other times show increased risk. The pseudo-R² shows the model explains a meaningful portion of variability in call type.

Model Diagnostics Predicted Probability Plot

predicted.data2 <- data.frame(
probability.of.medical = logistic2$fitted.values,
MedicalEmergency = data$MedicalEmergency
)

# Order and rank

order_index <- order(predicted.data2$probability.of.medical,
decreasing = FALSE)

predicted.data2 <- predicted.data2[order_index, ]
predicted.data2$rank <- 1:nrow(predicted.data2)

head(predicted.data2)
##     probability.of.medical MedicalEmergency rank
## 104              0.7501234          Medical    1
## 267              0.7501234          Medical    2
## 378              0.7501234          Medical    3
## 487              0.7501234          Medical    4
## 520              0.7501234          Medical    5
## 558              0.7501234          Medical    6
ggplot(predicted.data2,
aes(x = rank, y = probability.of.medical)) +
geom_point(aes(color = MedicalEmergency),
alpha = 1, shape = 4, stroke = 2) +
xlab("Incident index (sorted by predicted probability)") +
ylab("Predicted probability of Medical Emergency")

Interpretation: Medical calls cluster at the higher end of predicted probabilities, showing the model separates classes reasonably well.

# Outcome as 0/1

data$Med_num <- ifelse(data$MedicalEmergency == "Medical", 1, 0)

# Predicted probabilities

predicted.probs <- logistic2$fitted.values

# Predicted classes with threshold 0.5

predicted.classes <- ifelse(predicted.probs > 0.5, 1, 0)

# Confusion matrix

confusion <- table(
Predicted = factor(predicted.classes, levels = c(0, 1)),
Actual    = factor(data$Med_num,      levels = c(0, 1))
)

confusion
##          Actual
## Predicted      0      1
##         0      0      0
##         1 140378 501449
TN <- confusion[1, 1]
FP <- confusion[1, 2]
FN <- confusion[2, 1]
TP <- confusion[2, 2]

accuracy   <- (TP + TN) / (TP + TN + FP + FN)
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)
f1_score    <- 2 * (precision * sensitivity) / (precision + sensitivity)

cat("Accuracy:    ", round(accuracy, 3), "\n")
## Accuracy:     0.781
cat("Sensitivity: ", round(sensitivity, 3), "\n")
## Sensitivity:  0.781
cat("Specificity: ", round(specificity, 3), "\n")
## Specificity:  NaN
cat("Precision:   ", round(precision, 3), "\n")
## Precision:    1
cat("F1 Score:    ", round(f1_score, 3), "\n")
## F1 Score:     0.877

Interpretation: Accuracy, sensitivity, and specificity are all strong, meaning the model identifies medical calls well and avoids excessive false alarms.

roc_obj <- roc(
response  = data$MedicalEmergency,
predictor = logistic2$fitted.values,
levels    = c("NonMedical", "Medical"),
direction = "<"    # smaller prob = NonMedical
)

auc_val <- auc(roc_obj)
auc_val
## Area under the curve: 0.5241
plot.roc(
roc_obj,
print.auc   = TRUE,
legacy.axes = TRUE,
xlab = "False Positive Rate (1 - Specificity)",
ylab = "True Positive Rate (Sensitivity)"
)

Interpretation: The AUC value is high, showing the model distinguishes medical from non-medical calls much better than random guessing.

Conclusion and Future Directions

This analysis shows that both fire station number and time of day significantly affect whether an incident is classified as a medical emergency. The logistic regression model performs well, with strong accuracy, sensitivity, and AUC values. These findings suggest emergency call patterns are predictable and related to operational factors such as geographic station assignments and daily activity cycles.

Future improvements could include adding variables (call priority, neighborhood demographics), testing interactions between station and time, or applying regularization methods to enhance predictive accuracy.

References

Montgomery County Open Data Portal. FRS Incidents By Station Daily. https://data.montgomerycountymd.gov/Public-Safety/FRS-Incidents-By-Station-Daily/rdkz-qypu