## Class "Spatial" [package "sp"]
## 
## Slots:
##                               
## Name:         bbox proj4string
## Class:      matrix         CRS
## 
## Known Subclasses: 
## Class "SpatialPoints", directly
## Class "SpatialMultiPoints", directly
## Class "SpatialGrid", directly
## Class "SpatialLines", directly
## Class "SpatialPolygons", directly
## Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
## Class "SpatialPixels", by class "SpatialPoints", distance 2
## Class "SpatialMultiPointsDataFrame", by class "SpatialMultiPoints", distance 2
## Class "SpatialGridDataFrame", by class "SpatialGrid", distance 2
## Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
## Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
## Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2

Data for Analysis

1. Utah Surface Management Agency (SMA)

a. This dataset was collected by the BLM and retrieved from the Department of the Interior’s website. Its layers are polygon features that divide Utah’s surface area by the agency that manages each parcel.

2. United States Wildfires

a. This dataset was collected by the US Forest Service and retrieved from the Department of Agriculture’s website. The layers within this vector point shapefile depict points where wildfires have occurred since 1984.

c. A boundary shapefile is also available for this dataset; however, that file is very large, so the vector point file is used instead because this study is only interested in the general location and count of wildfires. For those interested in exporting this data as KML/KMZ, the file size and complexity are too great.

d. This dataset inventories wildfires greater than 500 acres in the eastern U.S. and greater than 1,000 acres in the western U.S. It has limitations: wildfires are confirmed using satellite imagery, and smaller fires are omitted. Even so, it is an extensive and accurate record of recent wildfires. Because the record begins in 1984, it does not depict wildfire behavior over a longer historical time scale.

3. Land Cover

a. This dataset was collected by the SWReGAP project, a multi-institutional group. The layers within this raster file depict land cover across a five-state region to assess biodiversity.

c. The generalizability of this study is outstanding: it covers numerous land cover types that were validated and assessed across multiple agency groups. Note, however, that the study is limited to 5 states in the intermountain west and southern U.S.; including northern states in the intermountain region would reduce the bias toward land cover types dominant in southern climates.

Load and Clean Data:

1. Surface Management (SMA) Shapefile

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\joshc\Documents\USU\2023\GEOG 4870\FInal Project\GEOG4870 Final\Final Data\SMA\utah_sma.shp", layer: "utah_sma"
## with 12009 features
## It has 9 fields

Data Cleaned

Drop column 4 as it was all NA

Drop missing values within rows

Remove duplicate rows

2. Wildfire Shapefile

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\joshc\Documents\USU\2023\GEOG 4870\FInal Project\GEOG4870 Final\Final Data\US Wildfires\S_USA.MTBS_FIRE_OCCURRENCE_PT.shp", layer: "S_USA.MTBS_FIRE_OCCURRENCE_PT"
## with 29926 features
## It has 22 fields
## Integer64 fields read as strings:  MAP_ID DNBR_OFFST DNBR_STDDV

Data Cleaned

Drop columns with no data: wildfires <- wildfires[, -c(14,17,20)]

Drop missing values within rows: wildfires <- wildfires %>% na.omit()

Remove duplicate rows: wildfires <- (wildfires[!duplicated(wildfires$FIRE_ID), ])
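The three cleaning steps can be tried on a small stand-in table before running them on the shapefile attributes. The sketch below uses base R and a made-up data frame named `fires`; only the column names `FIRE_ID` and `ACRES` come from the dataset, while the values and the all-NA column `EMPTY_COL` are hypothetical.

```r
# Toy stand-in for the wildfire attribute table (all values are invented).
fires <- data.frame(
  FIRE_ID   = c("A1", "A2", "A2", "A3", "A4"),
  ACRES     = c(1200, 3400, 3400, NA, 560),
  EMPTY_COL = NA,
  stringsAsFactors = FALSE
)

# 1. Drop columns that contain no data at all.
all_na <- vapply(fires, function(col) all(is.na(col)), logical(1))
fires  <- fires[, !all_na, drop = FALSE]

# 2. Drop rows that still have missing values.
fires <- na.omit(fires)

# 3. Keep only the first record for each FIRE_ID.
fires <- fires[!duplicated(fires$FIRE_ID), ]

print(fires)
```

Selecting empty columns with a test rather than by position keeps the step valid if the column order of the source file changes.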

## class       : SpatialPointsDataFrame 
## features    : 29926 
## extent      : -166.0911, -65.35063, 17.95656, 70.13926  (xmin, xmax, ymin, ymax)
## variables   : 19
## # A tibble: 29,926 × 19
##    FIRE_ID       FIRE_NAME ASMNT_TYPE FIRE_TYPE NODATA_THR GREENNESS_ LOW_THRESH
##    <chr>         <chr>     <chr>      <chr>          <dbl>      <dbl>      <dbl>
##  1 AZ3359910999… MUD SPRI… Initial    Wildfire        -970       -150        100
##  2 TX3129409771… OWL CREEK Initial    Wildfire        -970       -150        110
##  3 FL2686508190… UNNAMED   Initial (… Unknown         9999       9999        400
##  4 CO4001210858… SPRING C… Extended   Wildfire        -970       -150         80
##  5 AZ3586611185… ANDERSON  Extended   Wildfire        -970       -150         20
##  6 OR4334712281… WILLIAMS… Extended   Wildfire        -970       -150        110
##  7 FL2685008188… UNNAMED   Initial (… Unknown         9999       9999        400
##  8 FL2763008118… UNNAMED   Initial (… Unknown         9999       9999        350
##  9 UT3779310991… COOPER S… Extended   Wildfire        -970       -150        120
## 10 WA4649412121… DISCOVERY Extended   Wildfire        -970       -150        100
## # ℹ 29,916 more rows
## # ℹ 12 more variables: MODERATE_T <dbl>, HIGH_THRES <dbl>, IG_DATE <chr>,
## #   LATITUDE <dbl>, LONGITUDE <dbl>, ACRES <dbl>, MAP_ID <chr>, MAP_PROG <chr>,
## #   DNBR_OFFST <chr>, DNBR_STDDV <chr>, PRE_ID <chr>, POST_ID <chr>

3. Land Cover

Data Preparation

Research questions and necessary data.

1. Which areas in the U.S. are at most risk of wildfires?

a. us.cities data

b. Fire point locations within the US

2. Has there been an increase in large wildfires since 1984?

a. Wildfire Shapefile

b. The key variables within the dataset are the year in which each wildfire occurred and its size, measured in acres (the ACRES field).

4. Brainstorming

1. Frequency of large wildfires:

a. Filter wildfires that are well above the average size and visualize any change in the frequency of large fires over time.

2. General wildfire trends.

a. Separate the entire dataset into multiple time periods, then compare wildfire change across the periods on line charts.
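Both brainstormed ideas can be sketched in a few lines of base R. The records below are invented, and the cutoff for "well above the average size" (twice the mean) is a hypothetical choice made only for illustration; the two periods match the halves of the study period used later in the report.

```r
# Invented example records; only the column names ACRES and year follow the dataset.
fires <- data.frame(
  year  = c(1985, 1992, 2001, 2004, 2010, 2015, 2020),
  ACRES = c(800, 52000, 1500, 98000, 2300, 120000, 4100)
)

# Hypothetical definition of "well above average": more than twice the mean size.
threshold <- 2 * mean(fires$ACRES)
large     <- fires[fires$ACRES > threshold, ]

# Assign every fire to one half of the 1984-2021 study period.
fires$period <- cut(fires$year,
                    breaks = c(1983, 2002, 2021),
                    labels = c("1984-2002", "2003-2021"))

# Fires per period, and large fires per year.
period_counts <- table(fires$period)
print(period_counts)
print(table(large$year))
```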

DATA EXPLORATION

5. Summarize key variables of interest.

Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 29926 | 6918 | 25127 | 500 | 1168 | 1930 | 4213 | 1068802 |
| year | 29926 | 2006 | 10 | 1984 | 1999 | 2008 | 2014 | 2021 |
| GREENNESS_ | 29926 | 2984 | 4691 | -650 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 29926 | 14 | 1148 | -9999 | 40 | 75 | 164 | 9999 |
| MODERATE_T | 29926 | 23 | 6612 | -9999 | -120 | 277 | 400 | 9999 |
| HIGH_THRES | 29926 | 1048 | 8058 | -9999 | -9999 | 550 | 9999 | 9999 |

Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 846 | 8078 | 26515 | 502 | 1184 | 2050 | 4606 | 333571 |
| year | 846 | 2008 | 8.6 | 1990 | 2000 | 2010 | 2015 | 2020 |
| GREENNESS_ | 846 | 2464 | 4442 | -320 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 846 | 64 | 936 | -9999 | 46 | 75 | 150 | 740 |
| MODERATE_T | 846 | 848 | 6121 | -9999 | 150 | 292 | 449 | 9999 |
| HIGH_THRES | 846 | 1622 | 7727 | -9999 | -87 | 580 | 9999 | 9999 |

Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 29926 | 6918 | 25127 | 500 | 1168 | 1930 | 4213 | 1068802 |
| year | 29926 | 2006 | 10 | 1984 | 1999 | 2008 | 2014 | 2021 |

Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 10056 | 6211 | 19009 | 500 | 1193 | 1994 | 4465 | 563527 |
| year | 10056 | 1994 | 5.6 | 1984 | 1989 | 1994 | 1999 | 2002 |
| GREENNESS_ | 10056 | 3581 | 4895 | -500 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 10056 | 55 | 1062 | -9999 | 50 | 100 | 220 | 1100 |
| MODERATE_T | 10056 | -1285 | 6060 | -9999 | -9999 | 223 | 335 | 9999 |
| HIGH_THRES | 10056 | -696 | 7518 | -9999 | -9999 | 435 | 710 | 9999 |

Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 19870 | 7276 | 27706 | 500 | 1153 | 1899 | 4083 | 1068802 |
| year | 19870 | 2012 | 5 | 2003 | 2008 | 2011 | 2016 | 2021 |
| GREENNESS_ | 19870 | 2682 | 4555 | -650 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 19870 | -6 | 1188 | -9999 | 40 | 65 | 150 | 9999 |
| MODERATE_T | 19870 | 685 | 6779 | -9999 | 89 | 300 | 9999 | 9999 |
| HIGH_THRES | 19870 | 1931 | 8177 | -9999 | -9999 | 600 | 9999 | 9999 |

Summary Findings:

1. Data collected from 1990-2020 show wildfires ranging from 502 to 333,571 acres in size. The median fire size of 2,050 acres depicts the typical wildfire more accurately than the mean of 8,078 acres, which is likely skewed by outlying fires much larger in scope than the majority of wildfires.

2. While the minimum and maximum year values were determined by the study period, the records are concentrated in the later years (the median year is 2010, past the 2005 midpoint of the 1990-2020 range), which indicates an increasing frequency of wildfires in the 2nd half of the study period.

6. Create a boxplot to identify outliers.

Boxplot of years 1984-2002 and 2003-2021:

Outlier Findings:

When comparing the first half of the study period with the 2nd, the total number of wildfires increased from 10,056 (1984-2002) to 19,870 (2003-2021), nearly doubling.
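A boxplot drawn with R's default settings flags a point as an outlier when it lies more than 1.5 times the box height beyond the box. The sketch below shows that rule on invented fire sizes, using `boxplot.stats()` from base R and then repeating the upper-fence calculation by hand; it is an illustration of the rule, not the code that produced the plots in this report.

```r
# Invented fire sizes in acres; the last value is deliberately extreme.
acres <- c(600, 900, 1200, 1500, 1900, 2400, 3100, 4200, 250000)

# boxplot.stats() returns the five-number summary and the points
# that fall beyond the 1.5 * (upper hinge - lower hinge) whiskers.
stats <- boxplot.stats(acres)
print(stats$out)

# The same upper fence computed by hand from the hinges.
h <- fivenum(acres)                       # min, lower hinge, median, upper hinge, max
upper_fence <- h[4] + 1.5 * (h[4] - h[2])
print(acres[acres > upper_fence])
```

Because fire sizes are strongly right-skewed, this rule tends to flag many large fires, which is consistent with the numerous outliers described in this report.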

7. Compute and visualize statistics for acres burned and year.

Summary statistics for the two most important variables in measuring wildfire size over time: ACRES and year. The study period was separated into two halves and into 5-year increments to compare wildfire history in the 1st and 2nd halves of the study period.

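Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 846 | 8078 | 26515 | 502 | 1184 | 2050 | 4606 | 333571 |
| year | 846 | 2008 | 8.6 | 1990 | 2000 | 2010 | 2015 | 2020 |

Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 10056 | 6211 | 19009 | 500 | 1193 | 1994 | 4465 | 563527 |
| year | 10056 | 1994 | 5.6 | 1984 | 1989 | 1994 | 1999 | 2002 |

Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 19870 | 7276 | 27706 | 500 | 1153 | 1899 | 4083 | 1068802 |
| year | 19870 | 2012 | 5 | 2003 | 2008 | 2011 | 2016 | 2021 |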

8. Compute and visualize correlation.

## [1] 0.02212699
## [1] 0.01879452

Correlation between size and time, separated into two time periods: 1984-2002 and 2003-2021.

## [1] 0.02876973
## [1] 0.002651568

The correlation analysis showed little correlation between the size of a wildfire and the year in which it burned. All four coefficients are positive but below 0.03, so only a very slight positive relationship was found over the years.
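A correlation between size and year, overall and within each half of the study period, can be computed with `cor()` from base R. The sketch below uses invented records and assumes Pearson's coefficient (the default of `cor()`); the report does not state which method produced the values above. The Spearman line is an added comparison, not part of the original analysis.

```r
# Invented records; only the column names ACRES and year follow the dataset.
fires <- data.frame(
  year  = c(1986, 1990, 1995, 2000, 2005, 2010, 2015, 2020),
  ACRES = c(1500, 900, 4200, 700, 30000, 1100, 2600, 5200)
)

# Pearson correlation between year and fire size, all records.
r_all <- cor(fires$year, fires$ACRES)

# The same statistic within each half of the study period.
early   <- fires[fires$year <= 2002, ]
late    <- fires[fires$year >= 2003, ]
r_early <- cor(early$year, early$ACRES)
r_late  <- cor(late$year, late$ACRES)

# Spearman's rank correlation is less sensitive to a few very large fires.
rho_all <- cor(fires$year, fires$ACRES, method = "spearman")

print(c(all = r_all, early = r_early, late = r_late, spearman = rho_all))
```

With heavily skewed sizes, a handful of very large fires can dominate Pearson's coefficient, so comparing it with a rank-based measure is a cheap robustness check.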

9. Create an exploratory and multivariate visualization.

The analysis chart confirmed numerous outlier wildfires vastly larger than the median (2,050 acres), specifically in the years 1990, 2010, 2015, and 2020.

The line chart of average wildfire size since 1984 indicates a gradually increasing trend, with a spike in acres burned from 2002-2005.

The line chart comparing prescribed burns with wildfires indicates that prescribed burns have consistently averaged approximately 1,000 acres since 1984, while wildfires averaged approximately 15,000 acres in 1990, increasing to approximately 20,000 acres in 2021.

During the first half of the study period (1984-2002), average wildfire size increased from 5,700 to 8,800 acres, an increase of approximately 54%. During the 2nd half (2003-2021), average size decreased from 9,200 acres to approximately 7,500 acres in 2021, a decrease of roughly 18%. While there seems to be a general increase in size over time, a spike in large wildfires between 2003-2005 followed by a sudden decrease in the following years shows the unpredictability of wildfire trends.
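The percentage changes follow from the approximate average sizes quoted in the text. A minimal sketch of the arithmetic, where `pct_change` is a hypothetical helper that measures change relative to the starting value:

```r
# Relative change from a starting value to an ending value, in percent.
pct_change <- function(start, end) (end - start) / start * 100

# Approximate average fire sizes (acres) quoted in the text.
first_half  <- pct_change(5700, 8800)   # 1984-2002
second_half <- pct_change(9200, 7500)   # 2003-2021

print(round(c(first_half = first_half, second_half = second_half), 1))
```

Dividing by the ending value instead of the starting value gives a smaller figure for an increase, so the base should be stated alongside any percentage.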

Spatial Integration and Visualization

Data Preparation

1. Check Projections

## [1] TRUE

2. Crop wildfires and cities to Utah. Merge the two datasets together.

## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
##  [1] "TX" "OH" "CA" "GA" "NY" "OR" "NM" "LA" "VA" "PA" "FL" "IA" "AK" "IN" "MA"
## [16] "MI" "MD" "MN" "WI" "IL" "CO" "NC" "NJ" "AL" "WA" "ME" "AZ" "TN" "NE" "MT"
## [31] "MS" "ND" "MO" "ID" "UT" "KY" "CT" "OK" "NV" "WY" "SC" "WV" "NH" "AR" "RI"
## [46] "DE" "HI" "KS" "VT" "SD" "DC"
## [1] "UT"
## [1] "character"
## [1] "character"

3. Filter and create a new dataset of wildfires over 150,000 acres.

4. Large Wildfire Distribution (> 150,000 acres: United States)

5. Spatial Visualization of Wildfires Greater Than 150,000 Acres.

The line and point plots depict an increasing trend in the frequency of large wildfires over 150,000 acres. While the data from the 1980s show more incidents than the 1990s, the point plot shows a much greater frequency of large wildfires after 2005.
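The filtering step and a frequency count by decade can be sketched in base R as follows. The records are invented for illustration, so the counts say nothing about the real data; only the 150,000-acre threshold and the column names `ACRES` and `year` come from the report.

```r
# Invented records; only the column names ACRES and year follow the dataset.
fires <- data.frame(
  year  = c(1987, 1988, 1996, 2006, 2007, 2012, 2012, 2018, 2020),
  ACRES = c(180000, 210000, 155000, 160000, 320000, 90000, 450000, 230000, 1000000)
)

# Keep only fires larger than 150,000 acres.
large <- fires[fires$ACRES > 150000, ]

# Count large fires per decade (1980 stands for 1980-1989, and so on).
large$decade <- floor(large$year / 10) * 10
counts <- table(large$decade)
print(counts)
```

Grouping by decade smooths out single-year spikes, which makes a change in frequency easier to see than a year-by-year count.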