California Wildfires

Author

Robert Gravatt - Data 110

Satellite photograph of the Camp Fire on November 8, 2018, at 10:45 am near Paradise, California, 90 miles north of Sacramento. When the fire was first spotted earlier that day at 6:25 am, it was reported to be about a hundred feet by a hundred feet [4].

Introduction

Between 2013 and 2019, California experienced a series of devastating wildfires that reshaped the state’s landscape and underscored the urgency of understanding fire behavior. This project analyzes and visualizes wildfires of 100 acres or more during that period. California’s climate and geography make it especially vulnerable to catastrophic fires: hotter summers, prolonged droughts, and strong winds have increased the risk of fire. The Rim Fire of 2013 burned more than 257,000 acres in Tuolumne County, including parts of Yosemite National Park, and remains one of the largest fires in California history [1]. In 2017, the Tubbs Fire swept through Sonoma County, killing twenty‑two people and destroying over five thousand structures, while the Thomas Fire consumed nearly 282,000 acres across Ventura and Santa Barbara counties [2][3]. The Camp Fire of 2018, ignited by failed PG&E power lines, was the deadliest and most destructive wildfire the state has ever seen, claiming eighty‑five lives and destroying more than twelve thousand structures in Butte County [4]. While the Camp Fire is remembered for its devastating human toll, the Ranch Fire of 2018 is noted for its environmental destruction: it burned 410,203 acres, the record at the time [5]. The toll that these fires have taken on people’s lives is immeasurable.

The dataset chosen for this study, “California Wildfires (2013–2020),” was compiled from CAL FIRE’s official incident reports and federal records from the National Interagency Fire Center. It was cleaned and standardized into a single research‑ready CSV file that includes the latitude and longitude of each incident. It provides accessible statewide data on acreage burned, resources deployed, fatalities, and structures impacted without requiring direct use of the much larger raw CAL FIRE DINS datasets, which run to tens of megabytes. It is available on Kaggle as well as OpenBayData under the name “Historic California Wildfires Data (2013–2020)” [6].

This topic was chosen because California sits at the leading edge of climate change’s destructive environmental and economic impacts. The state has endured some of the largest and deadliest wildfires in U.S. history, driven by drought and by building development in forested areas. Studying California’s wildfire data provides an illustrative case study of how climate change amplifies natural hazards that threaten communities, ecosystems, and infrastructure. By analyzing these incidents, this project underscores the scale of recent disasters and the importance of fire preparedness and prevention in a changing climate.

References

[1] “Rim Fire.” Wikipedia, Wikimedia Foundation, last updated 2025. https://en.wikipedia.org/wiki/Rim_Fire

[2] “Tubbs Fire.” Wikipedia, Wikimedia Foundation, last updated 2025. https://en.wikipedia.org/wiki/Tubbs_Fire

[3] “Thomas Fire.” Wikipedia, Wikimedia Foundation, last updated 2025. https://en.wikipedia.org/wiki/Thomas_Fire

[4] “Camp Fire (2018).” Wikipedia, Wikimedia Foundation, last updated 2025. https://en.wikipedia.org/wiki/Camp_Fire_(2018)

[5] “Mendocino Complex Fire.” Wikipedia, Wikimedia Foundation, last updated 2025. https://en.wikipedia.org/wiki/Mendocino_Complex_Fire

[6] “Historic California Wildfires Data (2013–2020).” Kaggle, uploaded by OpenBayData, 2020. https://www.kaggle.com/datasets/ananthu017/california-wildfires

Load the data

library(tidyverse)

rawData <- read_csv("California_Fire_Incidents.csv")

Remove duplicate entries and filter out fires smaller than 100 acres

# Deduplicate and filter fires >= 100 acres
fires_major <- rawData |>
  unique() |>                                # remove exact duplicate rows
  subset(AcresBurned >= 100)                   # keep only fires at least 100 acres

# Check row counts before and after
nrow(rawData)        # original count
#> [1] 1636
nrow(fires_major)    # after cleaning and filtering
#> [1] 820
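
The before-and-after counts combine two effects: removing duplicate rows and removing small fires. The following is a minimal sketch, on an invented toy data frame (the fire names and acreages are made up and do not come from the dataset), of how the two steps could be counted separately. It also illustrates that `subset()` drops rows where `AcresBurned` is `NA`, so fires with missing acreage are excluded along with the small ones.

```r
# Toy stand-in for rawData; names and values are invented for illustration only
toy <- data.frame(
  Name        = c("A Fire", "A Fire", "B Fire", "C Fire", "D Fire"),
  AcresBurned = c(250, 250, 40, 1200, NA)
)

n_raw <- nrow(toy)

# Step 1: remove exact duplicate rows (the second "A Fire" row)
deduped   <- unique(toy)
n_deduped <- nrow(deduped)

# Step 2: keep fires of at least 100 acres
# subset() treats an NA condition as FALSE, so "D Fire" is dropped too
major   <- subset(deduped, AcresBurned >= 100)
n_major <- nrow(major)

# Row counts after each step
step_counts <- c(raw = n_raw, deduplicated = n_deduped, major = n_major)
print(step_counts)
```

Applying the same pattern to `rawData` would show how much of the drop from 1636 to 820 rows is due to duplicates and how much to the acreage threshold.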

Faceted bar graphs of resources deployed in major fires

library(RColorBrewer)

# Aggregate statewide totals by year
fires_resources <- fires_major |>
  group_by(ArchiveYear) |>
  summarize(
    Personnel   = sum(PersonnelInvolved, na.rm = TRUE),
    Engines     = sum(Engines, na.rm = TRUE),
    Helicopters = sum(Helicopters, na.rm = TRUE),
    Dozers      = sum(Dozers, na.rm = TRUE)
  ) |>
  pivot_longer(
    cols = c(Personnel, Engines, Helicopters, Dozers),
    names_to = "ResourceType",
    values_to = "ResourceCount"
  ) |>
  mutate(ResourceType = factor(ResourceType,
                               levels = c("Personnel", "Engines", "Helicopters", "Dozers")))

# Faceted bar chart with all years labeled
ggplot(fires_resources, aes(x = ArchiveYear, y = ResourceCount, fill = ResourceType)) +
  geom_col() +
  scale_fill_brewer(palette = "Set2") +
  scale_x_continuous(breaks = seq(2013, 2019, 1)) +   # show all years and not just evens
  labs(
    title = "California Wildfires (2013–2019): Statewide Suppression Resources",
    subtitle = "Total Personnel, Engines, Helicopters, and Dozers deployed per year",
    x = "Year of Fire",
    y = "Total Resources Deployed",
    fill = "Resource Type",
    caption = "Data Source: CAL FIRE Incident Records, 2013–2019"
  ) +
  facet_wrap(~ResourceType, scales = "free_y") +
  theme_minimal() +
  theme(legend.position = "bottom")
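
To make the aggregation step concrete, here is a small sketch of the same group, summarize, and pivot pattern on an invented toy table (the values are made up; only the column names mirror the dataset). It is meant as an illustration of the reshaping, not as part of the analysis. One detail worth noticing is that `sum(..., na.rm = TRUE)` returns 0 when every value in a group is `NA`, so a year with no reported engines appears as a zero-height bar rather than as a missing value.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for fires_major; all values are invented for illustration
toy_fires <- tibble(
  ArchiveYear       = c(2017, 2017, 2018),
  PersonnelInvolved = c(100, NA, 300),
  Engines           = c(10, 5, NA)
)

toy_long <- toy_fires |>
  group_by(ArchiveYear) |>
  summarize(
    Personnel = sum(PersonnelInvolved, na.rm = TRUE),  # NA values are skipped
    Engines   = sum(Engines, na.rm = TRUE)             # an all-NA group sums to 0
  ) |>
  pivot_longer(
    cols      = c(Personnel, Engines),
    names_to  = "ResourceType",
    values_to = "ResourceCount"
  )

# One row per year and resource type, the shape facet_wrap() expects
print(toy_long)
```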

Interactive Map of Fires by Geolocation

library(leaflet)

# Replace NA fatalities with 0
fires_major$Fatalities[is.na(fires_major$Fatalities)] <- 0

# Weighted square root scale for marker size
fires_major$SizeFatal <- sqrt(fires_major$Fatalities) * 0.8  # adjust multiplier for visibility

# Build tooltip text (horizontal)
fires_major$tooltip <- paste(
  fires_major$Name,
  "| Year:", fires_major$ArchiveYear,
  "| Acres:", fires_major$AcresBurned,
  "| Fatalities:", fires_major$Fatalities
)

# Define color palette by AcresBurned
pal <- colorNumeric(
  palette = c("gold", "darkorange", "red"),
  domain = fires_major$AcresBurned
)

# Interactive map 
leaflet(fires_major) |>
  addProviderTiles("Esri.WorldImagery") |>
  addCircleMarkers(
    lng = ~Longitude, lat = ~Latitude,
    radius = ~pmin(0.3 + SizeFatal, 10),   # cap max size for readability
    color = ~pal(AcresBurned),
    fillOpacity = 0.7,
    popup = ~tooltip,
    label = ~tooltip,
    labelOptions = labelOptions(direction = "auto")
  ) |>
  addLegend("bottomright",
            pal = pal,
            values = ~AcresBurned,
            title = "Acres Burned",
            labFormat = labelFormat(big.mark = ","),
            opacity = 1) |>
  setView(lng = -119.5, lat = 37.5, zoom = 6)

Summary of Visualizations

The first chart is a faceted bar chart created with ggplot2 that shows statewide totals of personnel, fire engines, helicopters, and bulldozers deployed to fight California wildfires from 2013 to 2019. Faceting each resource into its own panel, each with its own y-axis scale, made it easy to compare deployment patterns over time, highlighting that personnel consistently dominated suppression efforts while fire engines, helicopters, and bulldozers fluctuated year to year, with notable spikes during the severe fire seasons of 2013 and 2018.

The second visualization is an interactive Leaflet map showing the geographic distribution of major California fires. Circle markers were sized by the square root of fatalities, scaled by a multiplier of 0.8 and capped at a maximum radius of 10, and colored with a gradient ranging from gold through dark orange to red to emphasize acres burned. Tooltips provided each fire’s name, year, acreage, and fatalities, while the Esri World Imagery basemap added terrain and land‑cover context. The largest marker corresponds to the deadly Camp Fire of 2018 that incinerated the town of Paradise.
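
As a worked check of the marker sizing, the sketch below wraps the radius formula from the map code in a helper function. The function name `marker_radius` and its argument names are hypothetical and introduced only for illustration; the constants 0.8, 0.3, and 10 are the ones used in the map code. The fatality counts of 22 and 85 are the Tubbs Fire and Camp Fire figures cited in the introduction.

```r
# Hypothetical helper reproducing the map's radius formula:
# radius = min(0.3 + sqrt(fatalities) * 0.8, 10), with NA treated as 0
marker_radius <- function(fatalities, multiplier = 0.8, base = 0.3, cap = 10) {
  fatalities[is.na(fatalities)] <- 0
  pmin(base + sqrt(fatalities) * multiplier, cap)
}

# No fatalities, one fatality, Tubbs Fire (22), Camp Fire (85), missing value
radii <- marker_radius(c(0, 1, 22, 85, NA))
print(round(radii, 2))
# 0.30 1.10 4.05 7.68 0.30
```

With these constants the cap of 10 would only take effect above roughly 147 fatalities, so it does not bind for any fire described here, and the Camp Fire marker (radius about 7.68) is drawn at its uncapped size. The square root compresses the range, so the Camp Fire marker is about 26 times the radius of a zero-fatality marker instead of dominating the map entirely.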

Coding Bibliography

[1] R CHARTS. “Interactive Maps with Leaflet in R.” n.d. Accessed 13 Nov. 2025. https://r-charts.com/spatial/interactive-maps-leaflet/

[2] The R Graph Gallery. “Interactive Maps with leaflet in R.” n.d. Accessed 13 Nov. 2025. https://r-graph-gallery.com/package/leaflet.html

[3] Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. R for Data Science, 2e. O’Reilly Media, 2024.