Assignment 2: California Fire Hotspots

Author

Tony Fraser and Mark Gonsalves

Published

February 11, 2025

github

Datasource overview

  • This is the CAL FIRE Damage Inspection Program (DINS) database of structures damaged or destroyed by wildland fires in California since 2013, as documented by CAL FIRE and partnering agencies. Structures damaged before 2013 do not have a digital record. Fires in a Local Responsibility Area (LRA) or Federal Responsibility Area (FRA) may or may not be included.
  • Starting in 2018, the DINS program began collecting data on all structures (damaged and non-damaged). Before 2018, only damaged/destroyed structures were recorded.
  • This database includes structures impacted by wildland fire that are inside or within 100 meters of the fire perimeter. Structure type, construction features, and defensible space attributes are determined as accurately as possible, even when the structure is completely destroyed. Some attributes may be missing if they could not be determined.
  • Fire damage and poor access can limit inspections. While all inspections follow a systematic process, some impacted structures may not be identified, leading to a small margin of error.
  • The database contains two address fields:
    • Field-determined address: Street number, street name, and street type, entered by the inspector based on on-site observations.
    • Parcel-based address: Address (parcel) and APN (parcel), added through a spatial join after data collection.
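
The 2018 change in collection policy noted above can be checked by cross-tabulating damage category against incident year. The following is a minimal sketch on invented toy rows; it assumes the raw column names `* Damage` and `Incident Start Date` and the `M/D/YYYY hh:mm:ss AM` date format shown in the preview of this dataset, and `damage_by_year` is a hypothetical helper name.

```python
import pandas as pd

# Toy rows using the dataset's raw column names and date format; values are invented.
toy = pd.DataFrame({
    "* Damage": [
        "Destroyed (>50%)", "Destroyed (>50%)", "No Damage",
        "Affected (1-9%)", "No Damage",
    ],
    "Incident Start Date": [
        "10/8/2017 12:00:00 AM",
        "11/8/2018 12:00:00 AM",
        "11/8/2018 12:00:00 AM",
        "6/6/2020 12:00:00 AM",
        "6/6/2020 12:00:00 AM",
    ],
})

def damage_by_year(df):
    """Cross-tabulate damage category (columns) against incident start year (rows)."""
    year = pd.to_datetime(
        df["Incident Start Date"], format="%m/%d/%Y %I:%M:%S %p"
    ).dt.year
    return pd.crosstab(year, df["* Damage"])

table = damage_by_year(toy)
print(table)
```

On the real data, years before 2018 would be expected to show few or no "No Damage" rows if the collection policy described above holds.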

Datasource statistics

  • Rows: 130,717
  • Columns: 46
  • Update Frequency: Monthly or after a big fire
  • Coverage: 2013 to
  • Last Update: February 11, 2025, 4:07 PM (UTC-08:00)
  • URL: https://data.ca.gov/dataset/cal-fire-damage-inspection-dins-data

Load and preview

import os
import pandas as pd
import numpy as np
import networkx as nx
import pickle
from scipy.spatial import cKDTree
from geopy.distance import geodesic
from data620.helpers.glimpse import glimpse
from data620.helpers.dins_utils import clean_column_names

dins =  pd.read_csv("https://tonyfraser-data.s3.us-east-1.amazonaws.com/calfire/raw/POSTFIRE_MASTER_DATA_SHARE_2064760709534146017.csv")

Preview using glimpse

glimpse(dins)
Rows: 130721
Columns: 46

Column preview:
--------------------------------------------------------------------------------
OBJECTID                  <int64> 1, 2, 3, 4, 5
* Damage                  <object> No Damage, Affected (1-9%), No Damage, No Damage, No Damage
* Street Number           <float64> 8376.0, 8402.0, 8430.0, 3838.0, 3830.0
* Street Name             <object> Quail Canyon, Quail Canyon, Quail Canyon , Putah Creek, Putah Creek
* Street Type (e.g. road, drive, lane, etc.) <object> Road, Road, Road, Road, Road
Street Suffix (e.g. apt. 23, blding C) <object> None, None, None, None, None
* City                    <object> Winters, Winters, Winters, Winters, Winters
State                     <object> CA, CA, CA, CA, CA
Zip Code                  <float64> None, None, None, None, None
* CAL FIRE Unit           <object> LNU, LNU, LNU, LNU, LNU
County                    <object> Solano, Solano, Solano, Solano, Solano
Community                 <object> None, None, None, None, None
Battalion                 <object> 8.0, None, None, None, None
* Incident Name           <object> Quail, Quail, Quail, Quail, Quail
Incident Number (e.g. CAAEU 123456) <object> CALNU 008419, CALNU 008419, CALNU 008419, CALNU 008419, CALNU 008419
Incident Start Date       <object> 6/6/2020 12:00:00 AM, 6/6/2020 12:00:00 AM, 6/6/2020 12:00:00 AM, 6/6/2020 12:00:00 AM, 6/6/2020 12:00:00 AM
Hazard Type               <object> Fire, Fire, Fire, Fire, Fire
If Affected 1-9% - Where did fire start? <object> None, Deck on Grade, None, None, None
If Affected 1-9% - What started fire? <object> None, Unknown, None, None, None
Structure Defense Actions Taken <object> None, Hand Crew Fuel Break, None, None, None
* Structure Type          <object> Single Family Residence Multi Story, Single Family Residence Single Story, Single Family Residence Single Story, Single Family Residence Single Story, Single Family Residence Single Story
Structure Category        <object> Single Residence, Single Residence, Single Residence, Single Residence, Single Residence
# Units in Structure (if multi unit) <float64> 1.0, None, None, None, None
# of Damaged Outbuildings < 120 SQFT <float64> None, None, None, None, None
# of Non Damaged Outbuildings < 120 SQFT <float64> None, None, None, None, None
* Roof Construction       <object> Asphalt, Asphalt, Asphalt, Asphalt, Tile
* Eaves                   <object> Unenclosed, Unenclosed, Enclosed, Unenclosed, Enclosed
* Vent Screen             <object> Mesh Screen <= 1/8"", Mesh Screen <= 1/8"", Mesh Screen > 1/8"", Mesh Screen > 1/8"", Mesh Screen > 1/8""
* Exterior Siding         <object> Wood, Wood, Wood, Wood, Wood
* Window Pane             <object> Single Pane, Multi Pane, Single Pane, Single Pane, Multi Pane
* Deck/Porch On Grade     <object> Wood, Masonry/Concrete, No Deck/Porch, No Deck/Porch, Wood
* Deck/Porch Elevated     <object> Wood, No Deck/Porch, No Deck/Porch, No Deck/Porch, Wood
* Patio Cover/Carport Attached to Structure <object> No Patio Cover/Carport, No Patio Cover/Carport, No Patio Cover/Carport, Combustible, Combustible
* Fence Attached to Structure <object> No Fence, Combustible, No Fence, No Fence, No Fence
Distance - Propane Tank to Structure <object> None, None, None, None, None
Distance - Residence to Utility/Misc Structure > 120 SQFT <object> None, None, None, None, None
Fire Name (Secondary)     <object> Quail, Quail, Quail, Quail, Quail
APN (parcel)              <object> 0101090290, 0101090270, 0101090310, 0103010240, 0103010220
Assessed Improved Value (parcel) <float64> 510000.0, 573052.0, 350151.0, 134880.0, 346648.0
Year Built (parcel)       <float64> 1997.0, 1980.0, 2004.0, 1981.0, 1980.0
Site Address (parcel)     <object> 8376 QUAIL CANYON RD VACAVILLE CA 95688, 8402 QUAIL CANYON RD VACAVILLE CA 95688, 8430 QUAIL CANYON RD VACAVILLE CA 95688, 3838 PUTAH CREEK RD WINTERS CA 95694, 3830 PUTAH CREEK RD WINTERS CA 95694
GLOBALID                  <object> e1919a06-b4c6-476d-99e5-f0b45b070de8, b090eeb6-5b18-421e-9723-af7c9144587c, 268da70b-753f-46aa-8fb1-327099337395, 64d4a278-5ee9-414a-8bf4-247c5b5c60f9, 1b44b214-01fd-4f06-b764-eb42a1ec93d7
Latitude                  <float64> 38.4749601817272, 38.4774416286387, 38.4793575002602, 38.4873131633319, 38.4856356155902
Longitude                 <float64> -122.044464987985, -122.04325235398, -122.044584558241, -122.015115438533, -122.016122281982
x                         <float64> -13585927.6966, -13585792.7068, -13585941.0071, -13582660.5197, -13582772.601
y                         <float64> 4646740.75, 4647093.5986, 4647366.0337, 4648497.3988, 4648258.8264

Preview using df.head()

dins.head()
OBJECTID * Damage * Street Number * Street Name * Street Type (e.g. road, drive, lane, etc.) Street Suffix (e.g. apt. 23, blding C) * City State Zip Code * CAL FIRE Unit ... Fire Name (Secondary) APN (parcel) Assessed Improved Value (parcel) Year Built (parcel) Site Address (parcel) GLOBALID Latitude Longitude x y
0 1 No Damage 8376.0 Quail Canyon Road NaN Winters CA NaN LNU ... Quail 0101090290 510000.0 1997.0 8376 QUAIL CANYON RD VACAVILLE CA 95688 e1919a06-b4c6-476d-99e5-f0b45b070de8 38.474960 -122.044465 -1.358593e+07 4.646741e+06
1 2 Affected (1-9%) 8402.0 Quail Canyon Road NaN Winters CA NaN LNU ... Quail 0101090270 573052.0 1980.0 8402 QUAIL CANYON RD VACAVILLE CA 95688 b090eeb6-5b18-421e-9723-af7c9144587c 38.477442 -122.043252 -1.358579e+07 4.647094e+06
2 3 No Damage 8430.0 Quail Canyon Road NaN Winters CA NaN LNU ... Quail 0101090310 350151.0 2004.0 8430 QUAIL CANYON RD VACAVILLE CA 95688 268da70b-753f-46aa-8fb1-327099337395 38.479358 -122.044585 -1.358594e+07 4.647366e+06
3 4 No Damage 3838.0 Putah Creek Road NaN Winters CA NaN LNU ... Quail 0103010240 134880.0 1981.0 3838 PUTAH CREEK RD WINTERS CA 95694 64d4a278-5ee9-414a-8bf4-247c5b5c60f9 38.487313 -122.015115 -1.358266e+07 4.648497e+06
4 5 No Damage 3830.0 Putah Creek Road NaN Winters CA NaN LNU ... Quail 0103010220 346648.0 1980.0 3830 PUTAH CREEK RD WINTERS CA 95694 1b44b214-01fd-4f06-b764-eb42a1ec93d7 38.485636 -122.016122 -1.358277e+07 4.648259e+06

5 rows × 46 columns
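
The graph code in this report refers to columns such as `object_id`, `damage`, `incident_name`, and `site_address`, which implies that `clean_column_names` turns the raw headers into snake_case. The helper's source is not shown here, so the following is only a hypothetical stand-in (`clean_column_names_sketch`) that produces those names; the real helper may behave differently.

```python
import re
import pandas as pd

def clean_column_names_sketch(df):
    """Hypothetical stand-in for clean_column_names; the real helper may differ."""
    def clean(name):
        name = re.sub(r"\(.*?\)", "", name)           # drop hints such as "(parcel)"
        name = name.replace("*", "").strip().lower()  # drop the "*" marker prefix
        name = re.sub(r"[^a-z0-9]+", "_", name)       # spaces and punctuation -> "_"
        return name.strip("_")

    renamed = df.rename(columns=clean)
    # "OBJECTID" has no word boundary to split on, so it is renamed explicitly.
    return renamed.rename(columns={"objectid": "object_id"})

raw = pd.DataFrame(columns=["OBJECTID", "* Damage", "* Incident Name",
                            "Site Address (parcel)", "Latitude"])
cleaned_names = list(clean_column_names_sketch(raw).columns)
print(cleaned_names)
```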

Set up the graph

Each property is a node, and two nodes are connected if they lie within roughly 500 meters of each other. Like our actor analogy from class, a highly connected house is one with many nearby houses that were also inspected, which may indicate a fire-affected area. Because the database has also recorded undamaged structures since 2018, this network will tend to highlight densely packed neighborhoods rather than truly fire-damaged zones, but let’s see what patterns emerge.

This took a ridiculous amount of time to run, so we cached it in a pickle file.
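
As a small illustration of the "connect everything within 500 meters" idea, here is a sketch on three invented points. It is not the code used for this report: it assumes a simple equirectangular projection to meters around the mean latitude (reasonable for one neighborhood, less so for the whole state), and `to_local_meters` and `radius_graph` are hypothetical names. It also compares against a cutoff expressed in raw degrees, which reaches less far east-west than north-south.

```python
import numpy as np
import networkx as nx
from scipy.spatial import cKDTree

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

def to_local_meters(lat, lon):
    """Equirectangular projection around the mean latitude; adequate for short distances."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    lat0 = np.radians(lat.mean())
    x = EARTH_RADIUS_M * np.radians(lon) * np.cos(lat0)
    y = EARTH_RADIUS_M * np.radians(lat)
    return np.column_stack([x, y])

def radius_graph(ids, lat, lon, radius_m=500.0):
    """Connect every pair of points whose projected distance is at most radius_m."""
    xy = to_local_meters(lat, lon)
    pairs = cKDTree(xy).query_pairs(radius_m, output_type="ndarray")
    G = nx.Graph()
    G.add_nodes_from(ids)
    G.add_edges_from((ids[i], ids[j]) for i, j in pairs)
    return G

# Invented points: 1 and 2 are about 450 m apart east-west, 3 is several km north.
ids = [1, 2, 3]
lat = [34.1940, 34.1940, 34.2500]
lon = [-118.1530, -118.1481, -118.1530]

G_toy = radius_graph(ids, lat, lon)

# Same cutoff expressed in raw degrees (500 m / ~111 km per degree of latitude).
deg_pairs = cKDTree(np.column_stack([lat, lon])).query_pairs(500 / 111000)

print(G_toy.number_of_edges(), len(deg_pairs))
```

In this toy case the projected version links points 1 and 2, while the degree-based cutoff does not, because a degree of longitude is shorter than a degree of latitude at this latitude.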

class FireDamageGraph:
    """Singleton class for loading, building, and caching a fire damage network graph."""
    CACHE_FILE = "nogit_fire_graph.gpickle"
    INSTANCE = None
    
    def __new__(cls):
        if cls.INSTANCE is None:
            cls.INSTANCE = super(FireDamageGraph, cls).__new__(cls)
            cls.INSTANCE.graph = None
            cls.INSTANCE.df = None
            cls.INSTANCE.get_graph()
        return cls.INSTANCE
    
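    def get_graph(self):
        """Loads the graph from cache if available; otherwise, builds and caches it."""
        if os.path.exists(self.CACHE_FILE):
            print("Loading cached graph...")
            with open(self.CACHE_FILE, 'rb') as f:
                self.graph = pickle.load(f)
            # build_graph() is skipped on a cache hit, so populate self.df here;
            # otherwise fire_graph.df stays None and the later merge fails.
            self.df = dins.pipe(clean_column_names).round({"latitude": 6, "longitude": 6})
        else:
            self.build_graph()
        return self.graph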
    
    def build_graph(self):
        """Builds the fire damage graph from the dataset and caches it."""
        try:
            print("Building graph from scratch...")
            
            # Load and clean data
            self.df = (
                dins
                .pipe(clean_column_names)
                .assign(
                    latitude=lambda x: x["latitude"].round(6),
                    longitude=lambda x: x["longitude"].round(6)
                )
            )
            print(f"Loaded {len(self.df)} rows of data")

            G = nx.Graph()
            
            print("Creating nodes...")
            node_data = {
                row["object_id"]: {
                    "latitude": row["latitude"],
                    "longitude": row["longitude"],
                    "damage": row["damage"],
                    "structure_type": row["structure_type"],
                    "incident": row["incident_name"]
                }
                for _, row in self.df.iterrows()
            }
            G.add_nodes_from(node_data.items())
            print(f"Added {len(node_data)} nodes")
            
            print("Building KD-tree...")
            coords = np.array([[data["latitude"], data["longitude"]] 
                             for data in node_data.values()])
            tree = cKDTree(coords)
            
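            # 500 m expressed in degrees, using ~111 km per degree of latitude.
            # A degree of longitude is shorter by cos(latitude), so this cutoff is
            # exact north-south but only ~500*cos(lat) m east-west (~415 m at 34°N).
            distance_threshold = 500 / 111000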
            print("Finding pairs within threshold...")
            pairs = tree.query_pairs(distance_threshold, output_type='ndarray')
            print(f"Found {len(pairs)} pairs within threshold")
            
            print("Adding edges...")
            node_ids = list(node_data.keys())
            edges = [
                (node_ids[i], node_ids[j], 
                 {"weight": geodesic(
                     (coords[i][0], coords[i][1]), 
                     (coords[j][0], coords[j][1])
                 ).meters})
                for i, j in pairs
            ]
            G.add_edges_from(edges)
            print(f"Added {len(edges)} edges")
            
            # Save graph to cache using pickle
            print("Saving graph to cache...")
            with open(self.CACHE_FILE, 'wb') as f:
                pickle.dump(G, f)
            print("Graph saved to cache.")
            
            self.graph = G
            
        except Exception as e:
            print(f"Error building graph: {str(e)}")
            raise
# Create instance
fire_graph = FireDamageGraph()

# Use the stored DataFrame for the centrality calculation
df = fire_graph.df
G = fire_graph.graph

print(f"Graph has {G.number_of_nodes()} nodes and {G.number_of_edges()} edges")
Building graph from scratch...
Loaded 130721 rows of data
Creating nodes...
Added 130721 nodes
Building KD-tree...
Finding pairs within threshold...
Found 14070796 pairs within threshold
Adding edges...
Added 14070796 edges
Saving graph to cache...
Graph saved to cache.
Graph has 130721 nodes and 14070796 edges
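
The build above calls geopy's `geodesic` once per pair inside a Python list comprehension, about 14 million calls, which is likely a large share of the runtime. One possible alternative is a vectorized haversine distance in NumPy. This is a sketch under the assumption that a spherical Earth is accurate enough at the 500 m scale (haversine and the ellipsoidal geodesic differ by a fraction of a percent); `haversine_m` is a hypothetical helper, and the coordinates are three rows taken from the dataset.

```python
import numpy as np

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in meters between arrays of points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * np.arcsin(np.sqrt(a))

# coords and pairs have the same layout as in build_graph():
# coords[k] = [latitude, longitude], pairs[k] = [i, j] index pair from the KD-tree.
coords = np.array([[34.194601, -118.152900],
                   [34.194320, -118.152612],
                   [34.193634, -118.153116]])
pairs = np.array([[0, 1], [0, 2], [1, 2]])

i, j = pairs[:, 0], pairs[:, 1]
weights = haversine_m(coords[i, 0], coords[i, 1], coords[j, 0], coords[j, 1])
print(weights.round(1))
```

The resulting `weights` array lines up row by row with `pairs`, so it could be zipped with the node IDs and passed to `G.add_weighted_edges_from` in place of the per-pair loop.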

Show the top 30 hotspots

degree_centrality = nx.degree_centrality(G)

# Convert to a DataFrame
top_nodes = pd.DataFrame(degree_centrality.items(), columns=["object_id", "centrality"])

# Merge with fire data to get incident name, damage type, and address
top_nodes = top_nodes.merge(
    df[["object_id", "incident_name", "damage", "latitude", "longitude", "site_address"]], 
    on="object_id"
)

# Sort by centrality and show the top 30
top_30 = top_nodes.sort_values(by="centrality", ascending=False).head(30)
top_30
        object_id  centrality  incident_name  damage            latitude   longitude    site_address
103110  104102     0.010656    Eaton          No Damage         34.194601  -118.152900  248 W TERRACE ST, ALTADENA, CA 91001
103717  104709     0.010656    Eaton          No Damage         34.194320  -118.152612  223 W MARIPOSA ST, ALTADENA, CA 91001
104063  105055     0.010656    Eaton          No Damage         34.194136  -118.152626  223 W MARIPOSA ST, ALTADENA, CA 91001
102612  103604     0.010656    Eaton          No Damage         34.194476  -118.152452  224 W TERRACE ST, ALTADENA, CA 91001
104309  105301     0.010649    Eaton          Destroyed (>50%)  34.193634  -118.153116  232 W MARIPOSA ST, ALTADENA, CA 91001
103954  104946     0.010649    Eaton          No Damage         34.194660  -118.152550  234 W TERRACE ST, ALTADENA, CA 91001
106581  107573     0.010649    Eaton          Destroyed (>50%)  34.193820  -118.153324  246 W MARIPOSA ST, ALTADENA, CA 91001
102253  103245     0.010649    Eaton          No Damage         34.194518  -118.152606  234 W TERRACE ST, ALTADENA, CA 91001
107459  108451     0.010641    Eaton          Destroyed (>50%)  34.193591  -118.152692  214 W MARIPOSA ST, ALTADENA, CA 91001
103277  104269     0.010641    Eaton          No Damage         34.194561  -118.152743  240 W TERRACE ST, ALTADENA, CA 91001
102427  103419     0.010641    Eaton          No Damage         34.194752  -118.152726  240 W TERRACE ST, ALTADENA, CA 91001
107964  108956     0.010633    Eaton          Destroyed (>50%)  34.193742  -118.153420  246 W MARIPOSA ST, ALTADENA, CA 91001
103385  104377     0.010633    Eaton          No Damage         34.194811  -118.152857  248 W TERRACE ST, ALTADENA, CA 91001
102611  103603     0.010633    Eaton          No Damage         34.194375  -118.152754  233 W MARIPOSA ST, ALTADENA, CA 91001
102853  103845     0.010626    Eaton          Destroyed (>50%)  34.194255  -118.153114  247 W MARIPOSA ST, ALTADENA, CA 91001
103880  104872     0.010626    Eaton          No Damage         34.194638  -118.152415  224 W TERRACE ST, ALTADENA, CA 91001
106753  107745     0.010618    Eaton          Destroyed (>50%)  34.193504  -118.153238  2805 GLEN AVE, ALTADENA, CA 91001
104874  105866     0.010618    Eaton          Destroyed (>50%)  34.193794  -118.153167  238 W MARIPOSA ST, ALTADENA, CA 91001
106413  107405     0.010618    Eaton          Destroyed (>50%)  34.193356  -118.153548  2789 GLEN AVE, ALTADENA, CA 91001
103421  104413     0.010618    Eaton          No Damage         34.194221  -118.152951  239 W MARIPOSA ST, ALTADENA, CA 91001
105725  106717     0.010603    Eaton          Destroyed (>50%)  34.193686  -118.153261  238 W MARIPOSA ST, ALTADENA, CA 91001
102938  103930     0.010595    Eaton          No Damage         34.194069  -118.152496  215 W MARIPOSA ST, ALTADENA, CA 91001
105244  106236     0.010588    Eaton          Destroyed (>50%)  34.193748  -118.153041  232 W MARIPOSA ST, ALTADENA, CA 91001
106114  107106     0.010588    Eaton          Destroyed (>50%)  34.193400  -118.153262  2797 GLEN AVE, ALTADENA, CA 91001
102530  103522     0.010588    Eaton          No Damage         34.194197  -118.152806  233 W MARIPOSA ST, ALTADENA, CA 91001
107173  108165     0.010588    Eaton          Destroyed (>50%)  34.193272  -118.153374  2789 GLEN AVE, ALTADENA, CA 91001
102344  103336     0.010580    Eaton          Destroyed (>50%)  34.194474  -118.153085  247 W MARIPOSA ST, ALTADENA, CA 91001
107160  108152     0.010580    Eaton          Destroyed (>50%)  34.193496  -118.152622  206 W MARIPOSA ST, ALTADENA, CA 91001
102308  103300     0.010580    Eaton          Destroyed (>50%)  34.194446  -118.152885  239 W MARIPOSA ST, ALTADENA, CA 91001
104413  105405     0.010580    Eaton          Destroyed (>50%)  34.193430  -118.153532  2797 GLEN AVE, ALTADENA, CA 91001
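
To read these centrality values: NetworkX defines degree centrality as a node's degree divided by n - 1, where n is the number of nodes. With the 130,721 nodes in this graph, a centrality of 0.010656 corresponds to roughly 1,393 neighboring structures within the distance threshold (approximate, since the displayed centrality is rounded). The sketch below checks the definition on a small invented graph and inverts it; `neighbors_from_centrality` is a hypothetical helper name.

```python
import networkx as nx

# Small invented graph: "a" has three neighbors, "e" is isolated.
toy_graph = nx.Graph()
toy_graph.add_edges_from([("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")])
toy_graph.add_node("e")

n = toy_graph.number_of_nodes()
centrality = nx.degree_centrality(toy_graph)

# Degree centrality by hand: degree / (n - 1).
manual = {node: deg / (n - 1) for node, deg in toy_graph.degree()}

def neighbors_from_centrality(value, n_nodes):
    """Invert degree centrality to recover an (approximate) neighbor count."""
    return round(value * (n_nodes - 1))

print(centrality["a"], manual["a"])
print(neighbors_from_centrality(0.010656, 130721))
```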